Preface

This volume is a collection of articles which are based upon lectures
presented at the Twelfth Science Seminar of the Air Force
Office of Scientific Research, held at Albuquerque, New Mexico, in
June 1967. The title of this book, JOURNEYS IN SCIENCE:
SMALL STEPS-GREAT STRIDES, which was also the theme of
the seminar, evolved from an idea expressed by Dr. William J.
Price, Executive Director of AFOSR, in his introduction to the
book, SCIENCE IN THE SIXTIES, a publication resulting from
the Tenth Anniversary AFOSR Science Seminar in 1965. Dr. Price
said, “What has emerged vividly is the interlocking strength of
seemingly disparate avenues in the sciences today, and the clear
indication that great progress is often made in small steps occurring
in a variety of ways.”

This book, strictly speaking, is not a publication of the proceedings
of the seminar. Instead of consisting of verbatim transcripts
of  the  lectures,  it  is  a  collection  of  articles  which,  in  almost  every 
instance,  were  written  especially  for  the  book.  Most  of  the  chapters 
represent  condensed  versions  of  the  two  lectures  presented  by  each 
of  the  distinguished  researchers  appearing  on  the  program.  The 
state  of  the  art  in  many  scientific  areas  embraced  by  the  interests 
of  the  Air  Force  Office  of  Scientific  Research  is  clearly  delineated 
in  this  collection. 

The  seminar,  sponsored  by  AFOSR,  was  held  in  cooperation  with 
the  University  of  New  Mexico  and  the  Air  Force  Special  Weapons 
Center. 

I  am  greatly  indebted  to  my  wife,  Rena,  who  assisted  in  reading 
the manuscripts; to Mrs. Opal Broome, Miss Barbara Malecki and
Mrs. Wanda Climenhaga for assistance in reading proof; and to
Cadet John C. Frost, of the United States Air Force Academy, who
made the drawings which illustrate Chapter IX.


Arlington, Virginia
November,  1967 


D.  L.  A. 
I. Research on Research

Derek  J.  de  Solla  Price 
DEDICATION 

I should like to dedicate this lecture to the memory of that
distinguished friend and Yale colleague, our “flying philosopher,”
Norwood Russell Hanson, who was to have given
this keynote address. Russ and I were first together in Cambridge,
England from 1952 through 1956 where we enjoyed
to  the  hilt  all  the  fun  of  pioneering  in  our  respective  lobes 
of  the  dual  disciplines  of  History  and  Philosophy  of 
Science.  His  death  on  18  April  1967  in  the  crash  of  his 
Grumman  F8F  Bearcat  plane  was  an  event  that  all  his 
friends  had  anticipated  daily,  much  to  his  delight,  for  the 
past  several  years.  The  rehearsal  does  not,  however,  take 
the  sting  from  the  way  in  which  we  now  miss  this  strangely 
versatile  and  creative  man. 

The subject for these small but cumulatively significant steps of
scientific inquiry which we shall report is somewhat unusual. Most
scientists study the physical world around them and its animate and
inanimate behaviors; they are biologists and physicists, chemists and
astronomers. The target of our curiosity is the nature of science
itself and the behavior of the scientist in seeking it out. Why and
how does the scientist do what he does? Why does science “come out”
in just the way it does and not in some other way? What, in short,
to use the scientists’ own metaphor, makes science tick? How is it,
indeed, that a succession of single steps can cumulate in scientific
knowledge in a way that seems to reach farther and faster than
non-scientific knowledge?

DEREK J. DE SOLLA PRICE is Avalon Professor of the History of
Science in the Department of the History of Science and Medicine at
Yale. He has pioneered the scientific analysis of the modern growth and
organisation of scientific manpower and literature. Such analysis is
now being applied to the historical understanding of the development
of science and also, by several countries, to considerations leading to
policy-making decisions about science. At Yale University he was
Chairman of the new Department of the History of Science and Medicine
which has now become one of the largest such centers in the
world for this activity.

As  a  preamble,  because  of  the  special  character  of  this  research,  it 
is  necessary  to  cast  a  glance  at  what  George  Sarton  once  called  the 
Parascientific  Professions,  the  academic  specialties  of  the  history  and 
the  philosophy  of  science,  together  with  the  more  recent  accretions 
of  sociology  and  economics  of  science,  the  psychology  of  scientists 
(and  their  psychiatry,  insofar  as  that  is  special).  Perhaps  one  ought 
also  to  include  the  fields  of  research  management  studies  and  science 
policy studies, both waxing more and more diverse and ever stronger
with each growth of science in big business and bigger government.

The  whole  complex,  together  with  the  sort  of  material  I  shall 
display  today,  is  sometimes  called  “science  of  science”;  I  dislike  the 
name, hence the experiment with the title of this chapter. Nevertheless,
reduplication of names seems to have gone out of fashion in
science  since  the  day  when  Galileo  Galilei  was  christened.  My  own 
predilection  for  this  type  of  intellectual  labor,  rather  than  any 
other,  stems  from  a  desire  to  see  all  parts  of  the  research  fronts  of 
all  of  the  sciences;  while  wooing  the  muse  of  history,  I  never  quite 
managed  however  to  forget  that  I  had  been  a  physicist.  I,  therefore, 
have always had a strong temptation to try to understand the workings
of science, not in the usual humanistic way, but with the hard
and  precise  quantitative  and  mathematical  tools  of  the  physicists’ 
modes  of  thought  and  comprehension. 

The  urge  is  one  thing,  but  the  means  of  fulfilling  it  is  another. 
Quantitative  thought  is  quite  impossible  without  a  goodly  stock  of 
data.  When  Bernal  wrote  on  the  social  organization  of  science  in 
the  late  1930’s,  there  were  hardly  any  accurate  data  at  his  command 
to  tell  how  many  scientists  there  were  and  how  much  of  what  science 
was  being  done  in  what  country.  Now,  well  into  the  age  of  atoms 
and space, and thanks to the governmental preoccupation with accounting
for its funding and manpower, we have compendious
statistics from several countries including the prodigious blocks of
data, machine handled and issued in bulk by all the Federal agencies
of the United States Government. Then again, thanks to the
need to look after the giant rise of scientific literature and the swollen
scientific societies and their journals we have, often quite incidentally
to the main function of management, a storehouse of
data that never was available anywhere until about a decade ago.

In  some  respects  the  supply  of  data  is  an  embarrassment  here  as 
it  has  been  in  meteorology.  It  takes  a  great  deal  of  data  to  make 
even  a  weak  theory,  and  it  takes  a  great  deal  of  theory  before  you 
can explain any small quantity of the data with a sufficient understanding
to make little predictions. The situation also resembles
rather strongly the stage of astronomy in the period of Tycho Brahe,
which is very familiar to the historian of science as a key case study.
A mass of data has been accumulated; we have begun to make the
empirical generalizations and projections comparable to the Kepler
Laws  of  planetary  motion.  The  present  stage  must  be  a  continuation 
of  both  efforts  together  with  the  first  faltering  steps  towards  those 
Newtonian  Laws  that  will  explain  exactly  why  the  Kepler  motions 
behave  as  they  do. 

To  develop  an  understanding  of  this  sort  is  difficult.  It  involves 
the  necessity  of  being  not  only  literate  but  also  numerate.  We  shall 
have  to  think  in  numbers.  It  is  made  all  the  more  difficult  by  the 
fact  that  the  scientists  here  become  their  own  experimental  animals 
in  a  Haldane-like  fashion,  and  they  suspect  violently,  sometimes 
almost  psychotically,  all  attempts  to  evaluate  them  and  their  work 
in the way that they evaluate the universe. In this case the suspicions
are at least partially reasonable, for the basic methodology
bears  a  nasty  resemblance  to  what  I  am  told  is  the  Texas  method  of 
hog-weighing.  A  hog  is  put  at  one  end  of  a  seesaw  board;  stones  are 
piled  on  the  other  end  till  the  board  balances,  then  you  guess  the 
weight  of  the  stones. 

The procedure in this work has been to use headcounts of scientific
manpower, published papers and journals, patents and money,
and then to suppose that these quantities measure in some mysterious
way the “size” of science. It is perhaps not so bad as it sounds.
Most of the measurements show a very strong statistical regularity
and they obey with remarkable precision their equivalents of the
Kepler Laws. Science, measured in this sort of way, loose though it
seems, is much more regular in its behavior than most other measurable
characteristics of our civilization. Science is, so to speak, a much
more regular thing in its behavior than are people.

Perhaps the best example of this regularity is to be seen in the
First Law of Research on Research (R on R). This law states that if
you measure the size of science as a function of time, the progress is
seen to be a regular exponential increase, holding for periods as
long as 200 years, and maintaining a rate of compound interest
growth of the order of a doubling every 10 to 15 years, a multiplication
by a factor of 10 two or three times during each century. One
gets about the same rate whether one counts men, or scientific
journals, or the papers published in them. Rates vary only a little
from field to field of science, from country to country.

To push this example to the next stage, beyond being merely
empirical, we must ask two sorts of questions. Firstly, why is the
regular growth exponential and how can one compute its rate of
growth from other parameters? Secondly, given this law of growth,
what other consequences flow from it, and how will it relate to other
aspects of science which may be measurable? The first question,
demanding a knowledge of any underlying mechanism, has an obvious
lead: to a mathematician it is obvious that exponential growth
follows as a solution of the differential equation, dx/dt = kx; that is,
the situation holds where the rate of growth is somehow held in
constant proportion to the size of the population attained. Scientists
beget more scientists and old knowledge brings forth new knowledge
at a constant rate. This is true, it seems, for all knowledge and all
types of people, but for science and scientists there appears, as we
shall show later, to be some special research front structure that
lends to this activity a growth rate far in excess of all the other
activities of mankind.
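
The link between the doubling period and the other figures quoted for
the First Law can be checked with a few lines of arithmetic. The sketch
below assumes pure exponential growth, x(t) = x(0) exp(kt), the
solution of dx/dt = kx; the function names are illustrative and do not
come from the lecture.

```python
import math


def growth_constant(doubling_years: float) -> float:
    """Constant k (per year) in dx/dt = k*x for a given doubling period."""
    return math.log(2) / doubling_years


def growth_factor(doubling_years: float, span_years: float) -> float:
    """Factor by which x multiplies over span_years of exponential growth."""
    return math.exp(growth_constant(doubling_years) * span_years)


if __name__ == "__main__":
    for doubling in (10, 15):
        k = growth_constant(doubling)
        per_century = growth_factor(doubling, 100)
        print(f"doubling every {doubling} years: k = {k:.3f} per year, "
              f"growth per century = {per_century:.0f}x")
```

A ten-year doubling gives roughly a thousandfold growth per century
(three factors of 10) and a fifteen-year doubling roughly a
hundredfold (two factors of 10), which agrees with the "two or three
times during each century" of the text.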

The second question, asking the consequences of this mode of
growth, has some rather famous answers. Among these consequences
are the facts that most of the scientists that have ever lived are
alive now, and that most of these scientists alive happen to be
rather young persons. It is quite easy to prove these alarming facts.
Alive now there exist, roughly speaking, all scientists who were
born between 20 and 65 years ago. At a minimal estimate this is
about three of our periods during which science doubles in size.
Therefore, for every man born before the first of these periods there
will be one more from the first doubling, two from the second and
four from the last. Thus, out of every eight scientists that have
ever lived seven are alive right now, four of them being fellows
who have come into the business during the last fifteen years. It
follows then that 87½% of all the scientists who have ever been (say,
90%) are alive now, and of these more than half have been in the
occupation for less than 12 years.
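
The cohort argument above can be written out as a short calculation.
This is a minimal sketch that assumes exact doubling in each period and
takes "everyone born before" as a single unit cohort; the function
names are illustrative only.

```python
from fractions import Fraction


def cohort_sizes(doublings: int) -> list[int]:
    """Sizes of the cohorts added in each doubling period: 1, 2, 4, ..."""
    return [2 ** i for i in range(doublings)]


def fraction_alive(doublings: int) -> Fraction:
    """Share of all scientists ever who belong to the recent cohorts.

    The earlier population counts as 1 unit; each doubling period then
    adds as many people as existed before it.
    """
    recent = sum(cohort_sizes(doublings))
    return Fraction(recent, 1 + recent)


def newest_share_of_alive(doublings: int) -> Fraction:
    """Share of those now alive who entered in the latest period."""
    sizes = cohort_sizes(doublings)
    return Fraction(sizes[-1], sum(sizes))


if __name__ == "__main__":
    print(fraction_alive(3))          # 7/8
    print(newest_share_of_alive(3))   # 4/7
```

With three doubling periods in a working lifetime the result is seven
in eight, and four of the seven now alive (more than half) belong to
the newest cohort.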

Although  the  argument  is  so  simple,  the  facts  give  a  rather  deep 
explanation  of  some  of  the  more  pervasive  characteristics  of  science 
as  noted  by  historians  and  sociologists.  Science  has  a  persistent 
quality of immediacy and juvenility. Not only is it now true that
most of the scientists are young, and half of all we know has been
found out in the last decade or so, but this has always been true.
Since it is a consequence of the exponential growth that has held
sway  for  the  last  two  or  even  three  centuries,  Ben  Franklin  and 
perhaps  Isaac  Newton  could  have  said  just  the  same  things  about 
how  much  is  new  and  how  much  science  was  a  product  of  their  day 
and  age  as  it  never  had  been  before.  Science  runs  so  much  faster 
than  people,  so  much  more  rapidly  than  civilization. 

This  is  by  no  means  as  acute  as  the  situation  is,  in  fact.  Men 
come  into  the  field  and  new  ideas  come  on  the  scene;  some  men 
leave  the  field  and  some  ideas  disappear  to  the  limbos  of  rejection 
and  obsolescence.  By  any  reasonable  measure  it  can  be  shown  that 
the  death  rate  is  of  the  same  order  of  magnitude  as  that  of  overall 
increase.  Since  the  general  growth  rate  is  of  the  order  of  7%  per 
annum,  it  follows  that  this  comes  about  through  a  birth  rate  of 
something  like  15%  per  annum  combined  with  a  death  rate  that  is 
also  7%  or  8%  per  annum.  The  situation  is  rather  similar  to  that 
of a very primitive village where all the women are giving birth
almost every year, but the state of medicine is such that an appreciable
segment of the population, adults as well as children, dies every
year. The result is that the village contains a quite small core of
adults who provide the labor force and who have happened to
have survived so long, together with a large bulk of young children
who are, so to speak, just passing through this existence.
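
The balance of birth and death rates quoted above can be illustrated
numerically. The sketch below assumes simple year-by-year compounding
with a 15% birth rate and an 8% death rate; the starting population of
1,000 is an arbitrary figure chosen for illustration, not a datum from
the lecture.

```python
import math


def net_rate(birth_rate: float, death_rate: float) -> float:
    """Net annual growth rate as the difference of the two rates."""
    return birth_rate - death_rate


def doubling_time(annual_rate: float) -> float:
    """Years needed to double when growth compounds once a year."""
    return math.log(2) / math.log(1 + annual_rate)


def simulate(population: float, birth_rate: float,
             death_rate: float, years: int) -> float:
    """Apply births and deaths year by year and return the final size."""
    for _ in range(years):
        population += population * birth_rate - population * death_rate
    return population


if __name__ == "__main__":
    rate = net_rate(0.15, 0.08)
    print(f"net growth: {rate:.2%} per annum")
    print(f"doubling time: {doubling_time(rate):.1f} years")
    print(f"1000 people after 10 years: {simulate(1000, 0.15, 0.08, 10):.0f}")
```

A net rate near 7% per annum doubles the population in a little over
ten years, which sits at the fast end of the 10 to 15 year doubling
period of the First Law.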

It  turns  out  that  in  such  a  village,  or  to  come  back  to  our  scientists, 
in  the  population  of  science,  most  of  the  practitioners  are  newly  at 
it, but about half the man-years of work are due to that very small
stable bunch of individuals who happened to have survived the quite
enormous mortality. In hard numbers, whichever way measurements
are taken, the size of the hard core, responsible for about half the
man-years of work, half the papers, turns out to be about the
square root of the size of the total population. The large number of
children “just passing through” have their analogues in science
too.  In  this  age  of  heavy  financial  dependence  of  young  scientists 
on  old,  the  young  become  the  assistants  and  collaborators  of  the 
senior men, and it is this process that accounts for the rapidly increasing
amount of collaborative authorship in scientific papers. In
1870 it was very rare for a scientific paper to have a pair of authors
instead of just one claimant of intellectual property. By 1970 we
shall be well on the way to complete extinction of the single-author
paper in the most heavily pressed and financed scientific
fields  and  we  shall  even  be  moving  towards  the  phenomenon  of  an 
almost  infinite  number  of  authors  on  each  of  the  almost  infinite 
number  of  papers.  All  the  subsidiary  authors,  just  passing  through, 
are  part  of  the  rat-race  of  modern  science;  their  names  swell  the 
list  of  American  Men  of  Science  and  in  the  next  edition,  so  many 
of  the  names  will  be  gone,  others  will  have  taken  their  place,  and 
only  the  small  stable  square-root  population  will  remain  as  a  core 
of “known” people.
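
The square-root rule for the stable core has a consequence worth
seeing in numbers: the core grows far more slowly than the population,
so its share shrinks as science expands. The sketch below takes the
rule literally (core = square root of total); the population sizes are
hypothetical round figures, not data from the lecture.

```python
import math


def core_size(total: int) -> int:
    """Size of the stable core under the square-root rule (rounded down)."""
    return math.isqrt(total)


def core_share(total: int) -> float:
    """Core as a fraction of the whole population."""
    return core_size(total) / total


if __name__ == "__main__":
    for total in (100, 10_000, 1_000_000):
        print(f"population {total:>9,}: core {core_size(total):>5,} "
              f"({core_share(total):.2%} of the whole)")
```

Each hundredfold increase in the population multiplies the core only
tenfold, so that the half of the work attributed to the core comes
from an ever smaller fraction of the names on the list.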

Again in the spirit of our inquiry there are two ways of proceeding
in this case: firstly we can ask for a “Newton's Law” governing
the rat-race; secondly we can ask for the practical consequences over
and above those that have just been mentioned. Perhaps the chief
of these consequences is that one can quite easily show, by investigating
the minor characters, that collaboration is not a pooling of
talents to produce better work or different work; in most cases it
must be a device for getting papers out of people who essentially
seem to have less than a half paper in them. The number of such
people is far in excess of twice those with a whole paper in them; so
by using this lesser quality manpower, one can more than double
the output of scientific work. All that is needed is a sufficiency of
able and senior people to lead the teams and a sufficiency of incentive
to make the large numbers of lesser men aspire readily for the
chance of surviving to become one of the great and old.

To cool the situation a little I strongly suggest we determine to
award credit for authorship at the rate of 1/n of a point for each of
n authors of such a collaborative paper. It follows quite well from
our laws that they do not, on an average, deserve more than this.
Certainly they do not deserve a whole point each.
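
The 1/n rule proposed above is easy to state as a procedure. The
sketch below is one possible bookkeeping of it; the author labels and
the three sample papers are hypothetical and serve only to show the
arithmetic.

```python
from collections import defaultdict
from fractions import Fraction


def fractional_credit(papers: list[list[str]]) -> dict[str, Fraction]:
    """Give each of the n authors of a paper 1/n of a point.

    Each inner list holds the author names of one paper.
    """
    credit: dict[str, Fraction] = defaultdict(Fraction)
    for authors in papers:
        share = Fraction(1, len(authors))
        for name in authors:
            credit[name] += share
    return dict(credit)


if __name__ == "__main__":
    # Hypothetical example: one senior author and three junior ones.
    papers = [
        ["Senior"],
        ["Senior", "A"],
        ["Senior", "A", "B", "C"],
    ]
    for name, points in fractional_credit(papers).items():
        print(name, points)
```

Under this rule the points always sum to the number of papers, so
adding names to a paper redistributes credit instead of creating more
of it.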

The exact law from which this all comes, the law of the distribution
of quality among men, papers, journals, institutions, etc., might
be called the Second Law of Research on Research. It turns out that
the distribution in all cases is rather similar to the Pareto Law of
the distribution of economic incomes within a country. It is also
similar to the Zipf law which governs the sizes of cities within a
territory, or the length of words or sentences in a sample of language.
All of these laws are mathematical approximations that are
roughly equivalent one to the other, being lognormal or almost
inverse square law distributions. They are very asymmetrical and quite
unlike the usual Gaussian type distributions that are of universal
applicability to such entities as the height and weight and intelligence
of large populations.

One way of looking at such distributions is to compare them with
the case of populations of soap bubbles. Such bubbles have the
curious property that when two meet and come into interaction, the
little soap bubble blows up the big one. In most other cases of
physical interaction it is the big that gives to the small so that the
universe slowly evens out on its way to that great heat death of the
entropy law. With soap bubbles and with scientists the sign of the
interaction is the other way round, and the statistics of the population
tend to a condition in which most of the space is occupied by
a very few very large bubbles. Correspondingly, one gets most of the
scientific work done by a very few very clever and productive scientists,
though a very large number of lesser characters are striving
for this eminence and dying like flies on the way.

To be a little more exact about this distribution, it happens
numerically for men as for institutions that the chance of doubling
the size of the achievement is uniformly about one in four, no matter
what the size already attained. The chance that you will publish
ten more papers if you already have published ten is about 1 in 4;
the chance of doing a second paper if you have only so far done
your first is 1 in 4. The mortality is always the remaining three
chances out of four; always the mortality is due to this soap-bubble
principle; the more eminent and the more experienced tend to get
results before those of less degree. Science happens to be such that
there seems, after all, to be only the one world to discover. If the
constant is discovered by Planck there cannot be this discovery
again; it has been made. If Beethoven had not existed, other men
would have written quite different symphonies; Beethoven's private
property is unmistakable. If Planck, however, had not made this
particular discovery, somebody else would have had to have made
it, and the indications are to the historian of modern science that
the somebody else would have made the discovery rather quickly.
Most discoveries worth making are indeed made by many more
than a single person so that the syndrome of disputed priority and
subsequent contest for recognition is one of the most common within
history. Merton has considered this soap-bubble type of action from
the standpoint of the sociologist and for him it has become the
Biblical Matthew Principle, “unto him that hath shall be given,
and unto him that hath not shall be taken away, even that which he
hath.” Science is a quite grim battleground.
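
The one-in-four rule stated at the start of this passage fixes the
shape of the distribution. As a sketch, and on the assumption that "one
in four" is read as the conditional chance of reaching 2n papers given
that n have been reached, a survival function of the form
S(n) = n^(-a) satisfies S(2n)/S(n) = 2^(-a) for every n, so a = 2: an
inverse square law. The function names below are illustrative.

```python
import math


def survival_exponent(chance_of_doubling: float) -> float:
    """Exponent a for which S(n) = n**(-a) gives S(2n)/S(n) = chance."""
    return -math.log2(chance_of_doubling)


def survival(n: float, chance_of_doubling: float = 0.25) -> float:
    """Fraction of authors reaching at least n papers, relative to n = 1."""
    return n ** (-survival_exponent(chance_of_doubling))


if __name__ == "__main__":
    print("exponent:", survival_exponent(0.25))
    for n in (1, 10, 100):
        ratio = survival(2 * n) / survival(n)
        print(f"n = {n:>3}: chance of going on to {2 * n} papers = {ratio:.2f}")
```

The ratio is the same at every size, which is the sense in which the
mortality is "always the remaining three chances out of four."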

At  this  point  we  must  backtrack  a  little  to  consider  the  basic 
mechanism  by  which  science  cumulates  and  is  enabled  to  grow 
exponentially at a rate so much more surefooted and rapid than
mere  non-science.  Our  knowledge  about  this  comes  as  an  incidental 
benefit  from  the  production  of  the  Science  Citation  Index,  a 
quarterly  and  annual  ongoing  publication  in  which  you  can  look 
up  any  paper  that  has  ever  been  published  in  the  past  and  find 
out  who,  if  anybody,  has  been  citing  it  as  a  reference  in  their  more 
recently  published  work  in  more  than  1000  of  the  world’s  leading 
scientific  journals  in  all  fields.  The  great  advantage,  of  course,  is 
that  this  is  the  only  sort  of  index  that  runs  forward  in  time  instead 
of  backward  to  older  and  older  material;  it  is  also  the  only  sort  of 
subject  matter  categorization  that  does  not  let  a  cataloguer,  librarian, 
or other “expert” intervene between the generator and the seeker
of  information. 

Each  of  these  annual  citation  indexes  carries  data  on  millions  of 
references  and  we  have  been  most  fortunate  in  being  able  to  use  the 
results  of  various  machine  sortings  and  countings  to  determine  in 
just  what  sort  of  way  papers  are  tied  together  by  their  practice  of 
citing each other. It turns out that there are two main variables
involved. The first of these is the amount of citation, the number of
references  cited  by  each  paper  on  the  average.  Throughout  the 
whole  of  science  the  general  average  is  of  the  order  of  about  ten 
references  per  paper.  If  any  field  carries  far  less  than  this,  say  one 
or  two  or  none  at  all,  the  presumption  is  that  this  is  not  an  area 
where  papers  are  built  on  the  foundation  of  previous  papers.  Quite 
often  in  technological  or  professional  magazines  one  finds  this  sort 
of  article,  serving  for  news  value  or  some  other  purpose,  rather  than 
that of making a contribution to scholarship. Then again if there
are many more than ten references, say thirty or forty, the presumption
must be that this again is a beast of a different kind, probably
a review type of paper summarizing all recent previous work in a
given field.

A second variable concerns the way in which those references that
exist actually tie the new papers to the old. For highly non-scientific
fields it happens that any piece of material that has ever been published
has about an equal chance of being cited, no matter what its
date. Papers of all dates may be very good or very bad, but the mix
seems about the same for all dates, ancient and modern. With
science it is quite different: as Heisenberg’s father once said, each
paper comes with only three months’ guarantee. Very recent papers
or even those so new that they have hardly been circulated in
preprint form, let alone published, are often much more valuable,
more frequently cited than papers which have been around for
enough time so that the good juice has already been squeezed from
them and everybody knows the material anyway. For normal
scientific fields the proportion of citations to papers published
during the last five years is about 30%. For the hardest scientific
disciplines such as molecular biology and theoretical high energy
physics, where scientists are treading hard on each other's heels (and
heads), the proportion may rise to as high as 70%. For a few
peculiarly archival fields such as zoological and botanical taxonomy,
where older archetypes are preferred, the proportion of recent work
may be as low as 15%.
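
The second variable can be computed directly from a reference list.
The sketch below is one way of doing so; treating "the last five
years" as references no more than five years older than the citing
paper is an assumption about the boundary, and the sample reference
list is invented for illustration.

```python
def recent_share(citing_year: int, cited_years: list[int],
                 window: int = 5) -> float:
    """Fraction of a paper's references no more than `window` years old."""
    if not cited_years:
        raise ValueError("the paper cites no references")
    recent = sum(1 for year in cited_years if citing_year - year <= window)
    return recent / len(cited_years)


if __name__ == "__main__":
    # Hypothetical paper of 1967 with the typical ten references.
    references = [1966, 1965, 1964, 1960, 1955, 1950, 1948, 1940, 1930, 1925]
    print(f"{recent_share(1967, references):.0%} of references are recent")
```

A field can then be placed on the scale described above by averaging
this share over its papers: about 30% for normal science, as high as
70% for the hardest disciplines, and as low as 15% for archival fields.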

The hardest science grows from a very thin skin, whereas ordinary
scholarship grows from the body of its knowledge. By growing from
the skin alone, the proliferation is concentratedly attached to a few
papers instead of being diffusely related to so many. Unfortunately,
we do not have any decent mathematical formulation of the statistical
properties of networks of this sort, and only in a bumbling
way have we been able to divine and guess the various interrelations.
In a few cases, where a sample subject can be isolated and is small
enough to consider in detail, one can draw a complete diagram of
the structure. It turns out from this that science has a research
front something like forty papers deep; anything older than this
tends to be packed down into reviews and then into textbooks, if
it lasts at all. Most contributions exist only for the purpose of
enabling some other small and current advance to be made; once
this has been made the paper that spurred it becomes useless. It is
not intended to be packed down and become archival, even though
the strong superstition of scientists is that publication is somehow
a sacred duty in order to read the contribution into this perpetual
and immortal archive.

The  mythology  of  an  archive  is  something  that  runs  deep  in  the 
life  of  science  and  is  worthy  of  the  hardest  analysis.  It  is  a  most 
intriguing  paradox  that  the  scientist  secures  the  maximum  in  private 
intellectual  property  by  the  device  of  the  most  open  publication. 
Publication  is  the  key  to  it.  Papers  and  journals  exist  as  a  medium 
for  rapid  publication,  preferably  together  with  the  conferring  of  an 
aura  of  status  and  approval  through  stature  of  the  journal  and 
discretion  of  the  esteemed  editors.  There  is  little  evidence  that  such 
journals  and  papers  are  actually  read  as  a  means  of  transmitting 
the  scientific  information  so  printed.  That  knowledge  has  already 
been  sent  around  the  research  front  by  private  circulation  and  in¬ 
formal  means.  The  open  publication  is,  of  course,  scanned  to  see 
who  got  what  in  this  week,  and  it  is  naturally  the  means  whereby 
graduate  students  can,  by  hard  reading,  reach  the  research  front. 
It  would  be  no  good  at  all  if  science  was  able  to  run  so  fast  that 
embryonic  new  scientists  could  never  catch  up  with  the  advancing 
front. 

Scientists, it seems, are those who are highly motivated to publish
but not to read. Interestingly enough it is a totally different situation
in the technologies. One might almost define a technology as
a field where the chief intended product is an object, a manufacture,
a process, a chemical, rather than a paper. Publication there might
be (probably in simulation of the approved sciences) but it is
readily seen to be epiphenomenal. Technologists do not want to
publish usefully, for there is no tradition of giving one’s competition
a useful lead, but they want very much to read in case somebody
else has let slip a lead out of which they think they should be able
to get some useful advance of practical significance. For the most
part, such literature crisis as is often discussed is an artificial construct
on the part of technologists who believe there is some sort of
useful scientific information archive to which they have only a most
difficult access. In fact the sort of material they want is not published
at all. The scientist is concerned with the publication of quite different
material.

The  most  surprising  revelation  from  such  an  analysis  is  that  if 
one  defines  science  and  technology  in  these  sorts  of  terms  instead 
of  the  rather  weak  and  naive  definition  by  intent  that  we  usually 
use,  it  becomes  most  perplexing  to  analyze  any  relation  that  may 
exist  between  various  parts  of  science  and  various  parts  of  tech¬ 
nology.  So  far  as  I  have  been  able  to  take  this  analysis,  science  and 
technology  are  separate  and  almost  independent  cumulations.  Old 
science  breeds  new  science  at  constant  rate  through  the  medium  of 
a  knitted  network  of  literature,  and  analogously  old  technology 
breeds  through  an  interwoven  net  which  consists  not  of  an  embodied 
literature,  but  of  a  knowhow  that  is  only  partially  recorded  in  such 
things  as  patents  and  advertisements  and  the  trade  catalogues. 
Between  the  two  networks  there  is,  I  think,  what  the  physicist  would 
call  a  “weak  interaction”,  just  sufficient  to  keep  the  two  cumulations 
in  step.  Only  atypically,  in  those  rare  traumatic  incidents  that 
become  legend,  like  transistors  and  penicillin,  does  a  piece  of  new 
science immediately or quickly give rise to a new advance in technology.
For the most part, transfer from one to the other is carried
by people who migrate from one side to the other in the course of
training or shift of jobs. At all events, it is utterly wrong to conceive
of technology as being equivalent to applied science.

Indeed, in the matter of interaction between science and technology
there is even greater complication. Some technology may
not be related to science at all. Quite a lot of steam engine technology
had nothing whatsoever to do with thermodynamics, and
quite a lot of engineering goes like the invention of zip-fasteners
and safety pins, through a home inventor or bicycle shop mechanic
mechanism rather than through anything that could be recognized
as “scientific” training. A very large part of the activity which is
now  traditionally  called  “development”  and  even  “applied  research” 
consists  of  what  a  manufacturer  must  do  in  order  to  try  out  and 
start  making  a  new  product.  If  such  trying  and  starting  uses  any 
considerable  quantity  of  scientific  training,  I  suppose  it  should  be 
called  research,  but  if  it  runs  quite  close  to  manufacture,  say  as  a 
pilot  operation,  it  would  seem  better  to  include  it  as  part  of  the 
process and necessary expense of production.

Some light on this central problem of the science of science may
be  had  from  some  new  research  into  the  economics  of  science,  here 
presented  for  the  first  time.  I  might  add  that  this  seems  to  be  a 
good  case  of  simultaneous  discovery;  several  of  us  in  several  coun¬ 
tries  have  hit  upon  much  the  same  idea  at  the  same  time  and, 
naturally,  each  of  us  feels  that  he  is  the  true  begetter  and  originator 
of  the  notion.  We  have  all  found  a  rather  interesting  regularity  in 
a  sort  of  data  where  regularities  worth  talking  about  have  never 
been  noted. 

To  put  it  in  a  nutshell,  the  issue  is  why  the  United  States,  for 
example,  seems  to  produce  about  one  third  of  all  the  world’s 
physics  and  chemistry  and  most  other  sciences.  Why  is  the  share 
one  third  and  not,  say,  about  6%  which  is  its  share  of  the  world 
population,  or  90%,  or  anything  else?  About  a  year  ago  one  of  the 
most  famous  grand  old  men  of  science  in  the  USSR  asked  why  the 
USSR  was  only  about  half  the  size  of  USA  in  scientific  output  of 
papers  in  all  fields  even  though  the  statistics  showed  that  the  num¬ 
ber  of  scientists  in  the  two  countries  were  roughly  comparable  and 
certainly  not  different  by  a  factor  of  two.  Why  do  Canada  and  India 
each  publish  about  2%  of  the  world’s  science  though  India  has  a 
population  25  times  greater  than  that  of  Canada?  Not  only  do  they 
publish  about  equally,  but  also  their  governmental  and  professional 
societies  and  organizations  are  of  about  the  same  size,  structure  and 
complexity,  though  of  course  different  as  are  the  countries  in  their 
politics  and  philosophies.  Science  has  to  be  universal.  There  is  only 
one  game  of  physics  to  play  and  it  makes  little  difference  whether 
one  approaches  it  as  one  religion  or  another,  one  politics  or  another. 

Our  discovery  is  simply  that  one  gets  a  feeling  that  the  size  of 
each  country’s  scientific  effort  is  proportional,  not  merely  to  the 
size  of  its  population,  but  also  to  something  like  the  per  capita 
wealth  of  that  nation.  Now,  if  one  multiplies  population  by  wealth 
per  unit  population  one  gets  total  wealth.  We  come,  therefore,  to  the 
conclusion  that  at  any  instant  of  time  the  several  scientific  outputs 
of  the  nations  of  the  world  should  be  proportional  to  the  Gross 
National  Product  (or  something  like  it)  of  each  nation.  It  is  not 
worth  refining  the  theory  to  see  which  of  the  several  definitions  of 
GNP,  personal  income  etc.,  should  be  used— they  are  all  equivalent 
to  the  extent  of  this  rough  computation.  To  put  the  result  in  the 
terms in which we posed the question, the USA is about one third
of the world’s science because it is also about one third of the world’s
wealth.

In  Table  I  we  have  set  the  best  available  figures  for  the  numbers 
of  papers  in  Physics  Abstracts  and  Chemical  Abstracts  alongside 
those  for  the  wealth  and  the  populations  of  the  chief  countries  of 
the world that play a significantly large role in publishing the science
that is our common stock. It will readily be observed without
any apparatus of statistical correlations that most countries show
pretty good agreement between their shares of wealth and those of
the pure sciences. The United Kingdom and the Netherlands have
distorted figures for Physics because they do much more than their
share of publishing great and important international journals in
this field; though published from those countries, their contributors
are from elsewhere.

TABLE I

NATIONAL BROWNIE POINTS
Percentages of World Total

                        Share of   Share of       Share of       Share of
Country                 GNP        Phys. Abstr.   Chem. Abstr.   Population
                        1964       1961           1965           1964

(i) Larger participants:
USA                     32.8       31.6           28.5            5.9
USSR                    15.6       15.6           20.7            7.0
W. Germany               5.2       [illegible]     6.3            1.8
E. Germany               0.8                       2.2            0.5
UK                       4.8       13.6*           6.7            1.6
France                   4.5        6.3            4.5            1.4
Japan                    3.6        7.8            7.3            2.9
Italy                    2.6        3.4            2.7            1.5
Canada                   2.2        1.1            2.0            0.6
India                    2.2        1.8            2.2           14.4
Poland                   1.6        1.5            2.9            0.9
Australia                1.1        0.5            1.2            0.3
Romania                  1.0        0.6            0.9            0.5
Spain                    0.9        0.2            0.4            1.0
Sweden                   0.9        0.7            0.9            0.2
Netherlands              0.9        5.2*           0.8            0.4
Belgium                  0.8        0.3            0.6            0.3
Czechoslovakia           0.7        0.9            1.6            0.4
Switzerland              0.7        1.0            1.0            0.2
Hungary                  0.5        0.5            1.0            0.3
Austria                  0.4        0.2            0.5            0.2
Bulgaria                 0.4        0.2            0.5            0.2

(ii) Smaller participants and non-participants:
All other countries     15.8        0.8            4.6           57.5

* Note: Data known to be swollen because of one or more large international
journals published from this nation.

Apart  from  this  the  contributions  to  science  of  Japan  and  the 
U.K.  are  both  rather  high,  or  to  look  at  the  reverse  of  that  coin, 
their GNP’s are uncommonly low for nations that scientific. Spain,
on the other hand, does much less science than it should. Apart from
these, the shares, to this order of accuracy, show excellent agreement
with  expectation.  Outside  the  club  of  significant  scientific  nations 
there  stands  57%  of  the  world’s  population  with  nearly  16%  of  the 
wealth  but  only  less  than  5%  of  the  science— and  among  these 
countries  are  such  places  as  Denmark  which  is  highly  active  but 
too  small  to  show  on  our  table.  Missing  from  the  list  are  great  blocks 
such  as  Latin  America  that  should  have  3.7%  of  the  science,  China 
which  should  be  3.5%  of  the  world,  Africa  at  1.7%  and  the  Near 
East  which  should  publish  about  1%— these  alone  amounting  to 
10%  of  the  residual  16%  of  wealth. 
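
The comparison drawn from Table I can be checked with a few lines of arithmetic. What follows is only a minimal sketch, not part of the original analysis: it copies six rows from the table (using the Chemical Abstracts column) and divides each country's share of science by its share of wealth and by its share of population. The names `TABLE_I`, `share_ratios` and `summarize` are hypothetical, chosen for this illustration.

```python
# Percentages of the world total, copied from Table I:
# (share of GNP 1964, share of Chem. Abstr. 1965, share of population 1964).
TABLE_I = {
    "USA":    (32.8, 28.5, 5.9),
    "USSR":   (15.6, 20.7, 7.0),
    "Japan":  (3.6, 7.3, 2.9),
    "Canada": (2.2, 2.0, 0.6),
    "India":  (2.2, 2.2, 14.4),
    "Spain":  (0.9, 0.4, 1.0),
}


def share_ratios(gnp, chem, pop):
    """Return (science per unit of wealth, science per unit of population)."""
    return chem / gnp, chem / pop


def summarize(table):
    """Map each country to its pair of ratios."""
    return {country: share_ratios(*row) for country, row in table.items()}


if __name__ == "__main__":
    for country, (per_wealth, per_head) in summarize(TABLE_I).items():
        print(f"{country:8s} per wealth {per_wealth:5.2f}   per head {per_head:5.2f}")
```

On these rows the ratio to wealth stays within a factor of about two of unity, while the ratio to population differs by more than twenty-fold between Canada and India, which is the contrast the text describes.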

From  more  detailed  accounts  of  the  distribution  of  special  fields 
among  the  countries  of  the  world  we  come  to  the  conclusion  that, 
to  a  first  approximation,  all  the  pure  science  areas  are  distributed 
fairly  normally  as  we  have  seen  for  physics  and  chemistry.  There 
is  a  sort  of  equipartition;  if  a  country  has  1%  of  the  world’s  physics 
it  probably  has  1%  of  the  pure  mathematics  and  1%  of  the  bio¬ 
chemistry.  The  balance  between  the  sciences  seems  to  change  only 
slightly  from  country  to  country,  and  only  very  gradually  with  time 
within any one country. For the applied sciences it seems very different;
highly agricultural nations have a lot of agricultural science
and those without agriculture do not. It is similar for the other
applied sciences such as mining, engineering construction, airplane
manufacture, etc. The applied science or technology activity of each
nation, whatever one calls it, depends on very much more than the
wealth of the country. Each nation seems quite rightly to make its
own decisions to deploy its useful labor into regions that are
valuable to it but not necessarily for other lands. Science again
satisfies  an  international  constraint;  either  you  play  the  game  and 
conform  with  the  world  deployment  or  you  tend  not  to  play  at  all. 

The implications of this tentative equality between scientific output
and economic wealth of nations are considerable; it may be of the
greatest  importance  to  all  matters  of  general  science  policy.  On  the 
world  scale  there  are  published  each  year  about  700,000  scientific 
papers,  by  about  the  same  number  of  authors,  and  the  GNP  of  the 
whole world is about $2 × 10^12. There must be therefore about three
million  dollars  of  GNP  of  each  country  for  every  scientific  paper  in 
that  country,  or  about  the  same  for  each  author.  If  we  reckon  an 
average  salary  at  about  $10,000  per  annum  and  an  equal  amount  for 
the  overhead  in  plant  and  equipment,  secretaries  and  paper  clips, 
the  cost  of  the  scientists  comes  out  to  be  just  0.7%  of  the  GNP  of 
every  country  that  is  participating  in  modern  science.  An  alterna¬ 
tive  route  to  the  same  result  is  to  take  the  agreed  and  standard 
figure  for  the  cost  of  research  index— at  about  $20,000  per  published 
paper.  It  should  be  noted  however  that  both  computations  and  the 
final  result  of  0.7%  of  GNP  refer  only  to  the  present  situation.  The 
change  with  time  may  be  quite  rapid.  Both  scientific  salaries  and 
the  GNP  (or  personal  income)  of  countries  tend  to  rise  together  at 
a  rate  that  is  typically  about  4%  per  annum.  The  amount  of  scien¬ 
tific  work,  in  terms  of  numbers  of  papers  and  of  scientists  is  rising 
much  more  rapidly,  typically  with  a  doubling  every  ten  years.  It 
follows  then  that  simply  to  maintain  the  status  quo,  each  country 
must  double  the  percentage  spent  on  pure  science  from  its  GNP  in 
every  decade.  Though  we  have  a  figure  of  0.7%  now,  it  was  only 
0.3%  in  1956  and  it  will  be  1.5%  in  1976  and  12%  in  2006,  assum¬ 
ing  similar  conditions  to  hold. 
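
The arithmetic of this paragraph can be retraced in a short sketch. It is offered only as an illustration under stated assumptions: the constants are the round figures quoted in the text, the base year of 1966 is inferred from the dates 1956, 1976 and 2006, and the function names are hypothetical.

```python
WORLD_GNP = 2e12            # dollars, the text's figure for the whole world
PAPERS_PER_YEAR = 700_000   # scientific papers published each year
COST_PER_PAPER = 20_000     # dollars: a $10,000 salary plus equal overhead


def gnp_per_paper(world_gnp=WORLD_GNP, papers=PAPERS_PER_YEAR):
    """Dollars of GNP standing behind each published paper."""
    return world_gnp / papers


def science_share_of_gnp(cost_per_paper=COST_PER_PAPER):
    """Fraction of GNP spent on the paper-producing scientists."""
    return cost_per_paper / gnp_per_paper()


def projected_share(year, base_year=1966, base_share=0.7, doubling_years=10):
    """Percent of GNP needed in `year` if the share doubles every decade.

    base_year=1966 is an assumption read off the dates given in the text.
    """
    return base_share * 2 ** ((year - base_year) / doubling_years)


if __name__ == "__main__":
    print(f"GNP per paper: ${gnp_per_paper():,.0f}")
    print(f"Share of GNP:  {100 * science_share_of_gnp():.2f}%")
    for year in (1956, 1976, 2006):
        print(year, f"{projected_share(year):.2f}%")
```

A strict doubling per decade gives 0.35%, 1.4% and 11.2% for the three years, which the text rounds to 0.3%, 1.5% and 12%.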

Such  conditions  cannot  of  course  hold  indefinitely,  and  indeed 
they begin to break down quite rapidly because in the most developed
countries the 0.7% we have just computed is only a small
fraction, about one fifth or one quarter, of the total expenditure on
what is traditionally known as R & D, research and development.
Since  we  have  already  computed  this  0.7%  as  the  amount  needed 
(give  or  take  a  factor  of  two  perhaps)  to  support  the  pure  sciences 
(and  perhaps  any  “pure  technologies”),  it  follows  that  the  other 
two  thirds  or  three  quarters  of  the  national  expenditure  must  be 
going to buy technology, presumably to buy that sort of technology
that can be billed as “research” with all the aura of science rather
than simple manufacture. If we were to spend 12% of the GNP on
pure sciences, we should, to preserve the same balance, be spending
a total of half to two thirds of the GNP on R & D, an almost
inconceivable amount.

At  present  the  total  USA  expenditure  on  R  and  D  is  of  the 
order  of  3.5%  of  the  GNP,  and  it  was  1.4%  a  decade  ago.  The 
figures  are  thus  running  about  five  times  what  we  have  computed 
for  the  pure  component  of  universal  scientific  knowledge,  the  paper- 
producing  industry  alone.  There  is  indeed  good  reason  why  the 
maximum  deployment  of  scientists  outside  the  knowledge  industry 
should  be,  at  most,  of  order  of  magnitude  three  or  four  times  those 
whom  we  have  counted  as  pure  scientists.  Traditionally  in  the 
sciences  only  about  20%  of  the  Ph.D.  graduates  are  recycled  so  as 
to  become  college  teachers.  In  the  humanities  the  conventional 
figure  is  around  80  or  90%  so  that  there  is  in  those  fields  very  little 
spare  output.  The  chief  service  to  the  community  being  performed 
by  humanists  is  the  education  of  the  young  at  the  undergraduate 
level;  at  the  graduate  level  the  principal  activity  is  reproducing 
their  own  kind,  teaching  students  to  become  teachers  to  train  more 
students  in  an  endless  cycle.  In  the  sciences  the  surplus  exceeds  the 
needs  of  reproduction  by  a  factor  of  four. 

One  rough  way  of  accounting  for  this  is  to  make  use  of  the  known 
rate  of  exponential  growth  to  determine  the  needs  in  self-reproduc¬ 
tion.  For  every  active  100  scientists  it  is  necessary  to  train  about 
seven  new  ones  each  year;  assuming  that  this  is  about  a  four  year 
process  there  will  therefore  be  about  30  students  in  residence  at 
any  time,  and  considering  the  national  average  it  will  take  a  supply 
of  about  20  teachers  at  the  most  to  care  for  them.  This  leaves  some 
80  active  scientists  who  need  not  be  teachers,  and  the  ratio  is  about 
4  to  1  as  expected. 
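
The accounting above can be written out as a small calculation. This is a rough sketch only: the ten-year doubling time, the four-year training period and the figure of 20 teachers per 100 active scientists are taken from the text, while the function names and the use of a compound annual growth rate are assumptions made for the illustration.

```python
def annual_growth_rate(doubling_years=10.0):
    """Compound yearly growth implied by a given doubling time."""
    return 2 ** (1 / doubling_years) - 1


def teaching_balance(active=100, doubling_years=10.0, training_years=4, teachers=20):
    """Return (new scientists per year, students in residence, non-teachers per teacher).

    `teachers` is the text's estimate ("about 20 teachers at the most"),
    not something derived here.
    """
    new_per_year = active * annual_growth_rate(doubling_years)
    students = new_per_year * training_years
    ratio = (active - teachers) / teachers
    return new_per_year, students, ratio


if __name__ == "__main__":
    new, students, ratio = teaching_balance()
    print(f"new scientists per year: {new:.1f}")
    print(f"students in residence:   {students:.1f}")
    print(f"non-teachers per teacher: {ratio:.0f} to 1")
```

A ten-year doubling corresponds to growth of a little over 7% a year, hence about seven new scientists per hundred and just under thirty students in training at any time.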

Of  course,  the  ratio  is  only  valid  if  the  active  scientists  are  in  the 
same  sort  of  field  of  science  whether  they  teach  or  serve  the  com¬ 
munity  by  delivering  some  other  service  or  product.  Thus  it  is 
possible  to  use  in  physics  industries  about  four  times  the  man¬ 
power  of  physics  teachers,  and  in  chemistry  a  similar  proportion, 
etc.  One  cannot  just  decide  to  have  a  very  large  deployment  of 
labor  in,  let  us  say,  agricultural  researches,  for  the  manpower 
available is only four times that of the pure science in the related
areas of biology, and the amount of biological research, like those
of the other sciences, is a fixed amount in equilibrium each with
the other and all with the wealth of the country.

It  would  seem  that  in  the  smaller  and  less  developed  countries, 
the  current  total  expenditure  on  science  can  be  the  calculated  1% 
as  a  minimum.  Any  expenditure  beyond  that,  up  to  a  possibility 
of  about  a  factor  of  four  beyond  that,  implies  the  support  of  the 
associated  technologies.  It  is,  of  course,  a  vital  question,  but  one  on 
which  there  has  not  even  been  speculation  to  determine  whether 
the  wealth  of  countries  is  due  to  their  attention  to  this  deployment 
in  technology,  or  whether  only  the  amount  supported  is  due  to  the 
wealth.  At  all  events  we  know  now  that  there  tends  to  be  a  good 
correlation, so it is reasonable to suppose as a matter of policy that
each  country  should  spend  about  0.7%  of  its  GNP  on  the  support  of 
those  whose  business  it  is  to  engage  in  the  knowledge  industry  and 
publish  scientific  papers.  They  may  then  spend  up  to  four  times 
as  much  in  any  sectors  they  choose  where  the  trained  manpower 
can  be  deployed  in  technologies  useful  to  the  particular  needs  and 
industries  of  the  country  concerned. 

As  a  further  trial  of  this  important  new  principle,  I  present  here, 
also  for  the  first  time,  a  set  of  calculations  based  on  the  wealth 
(measured  in  total  personal  income)  and  the  scientific  manpower 
in  various  fields  (taken  from  the  National  Register  for  1964  and 
1966  of  all  of  the  separate  states  of  the  USA).  In  Table  II  we  show 
a  computer  printout  result  which  sets  the  states  in  order  of  the 
share  of  manpower  per  unit  share  of  wealth.  According  to  the 
theory a “normal” state should have this ratio about equal to
unity, and indeed 15 of the 51 states come within 10% of unity,
and a total of 34 are within 25% of unity; this is very good agreement,
given the random noise expected in this. Only eight states
have  abnormally  high  ratios,  and  for  each  of  them  there  seems  a 
good  explanation  why  there  should  be  a  much  larger  scientific 
population  than  one  might  expect.  The  District  of  Columbia  is 
obviously  artificial  in  its  structure,  Delaware  has  several  well-known 
large  chemical  companies,  New  Mexico  and  Massachusetts  are  known 
to  have  abnormally  large  holdings  in  physics,  Maryland  has  the 
National  Institutes  of  Health,  etc.  At  the  other  end  of  the  scale 
are the educationally depressed states such as Arkansas, Mississippi,
Kentucky and Georgia; this calculation shows that they have about
one half of the scientific activity they deserve and need for education
and development. For more detail in this table we have taken each
field in turn and measured the amount (a Chi value) by which it
deviates from what one might expect if each science was distributed
in the same proportions as the total scientific manpower. Only the
largest and most significant deviations have been indicated, but as
a matter of interest we have computed figures for both 1964 and
1966 and indicated on the table whenever there has been a considerable
shift with time.

TABLE II

Ratio: % Sc. manpower / % Personal inc.

Ratio     State           Fields Overconcentrated                       Fields Underconcentrated

5.61      D.C.            - economics, ↑ statistics                     - chemistry
3.22      Delaware        - chemistry
2.02      New Mexico      [?] physics
1.75      Wyoming         ↑ agriculture, [?] earth sc.
1.70      Colorado        ↑ agriculture, [?] earth sc., ↑ meteorology   - chemistry
1.60      Maryland        - biology
1.50      Utah            - agriculture
1.40      Massachusetts   - physics                                     - agriculture, - earth sc.
1.25      Alaska          ↑ agriculture, - earth sc., ↑ meteorology
1.25      Oklahoma        - earth sc.
1.23      Idaho           ↓ agriculture
1.22      Montana         ↑ agriculture, [?] earth sc.
1.21      New Jersey      - chemistry                                   - agriculture, - earth sc., - biology
1.14      Arizona         ↓ agriculture
1.09      Washington      ↑ agriculture
1.07      Louisiana       ↑ earth sc.
1.03      Oregon          ↑ agriculture
1.01      California      [?] physics, ↓ mathematics                    - chemistry
1.00      Texas           - earth sc.
1.00      Vermont
0.99      Connecticut
0.99      New York        - psychology, ↓ economics                     ↑ agriculture, - earth sc.
0.97      Tennessee
0.97      Minnesota
0.96      Pennsylvania    ↓ chemistry                                   - earth sc.
0.94      New Hampshire
0.92      Hawaii          ↑ meteorology
0.90      Virginia        ↑ mathematics
0.90      West Virginia   [?] chemistry
0.85      Wisconsin
0.83      Rhode Island
[illeg.]  Ohio            - chemistry
0.80      Illinois        - chemistry                                   - earth sc.
0.80      Kansas
0.79      North Dakota    [?] agriculture
0.79      Missouri
0.79      Iowa
0.79      Indiana
0.78      Nevada          ↑ agriculture
0.78      South Dakota    - agriculture
0.76      Michigan
0.76      North Carolina  [?] biology
0.69      Nebraska
0.69      Maine           ↑ agriculture
0.6[?]    Alabama         - agriculture
0.64      Florida         - meteorology
0.61      South Carolina  - agriculture
0.61      Georgia         ↑ agriculture
0.60      Kentucky                                                      [?] earth sc., - agriculture
0.57      Mississippi
0.53      Arkansas        - agriculture

↑ maldistribution increases
- maldistribution constant
↓ maldistribution decreases
(all from 1964 to 1966)
[?] trend mark illegible in the source copy
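
The two quantities used in this comparison can be sketched as follows. The text does not give the formula behind its "Chi value", so the signed form (actual - expected) / sqrt(expected) used here is an assumption, as are the function names and the manpower counts in the example, which are invented purely for illustration.

```python
import math


def manpower_ratio(state_manpower, all_manpower, state_income, all_income):
    """Share of scientific manpower per unit share of personal income."""
    return (state_manpower / all_manpower) / (state_income / all_income)


def expected_count(state_total, field_total, grand_total):
    """Scientists expected in one field if the state mirrored the national mix."""
    return state_total * field_total / grand_total


def chi_value(actual, expected):
    """Signed deviation from expectation (assumed form, see lead-in)."""
    return (actual - expected) / math.sqrt(expected)


if __name__ == "__main__":
    # Hypothetical state: 2,000 scientists out of 200,000 nationally,
    # with 400 of them in a field that employs 10,000 nationally.
    expected = expected_count(2_000, 10_000, 200_000)
    print("expected:", expected)
    print("chi:", chi_value(400, expected))
    print("ratio:", manpower_ratio(2_000, 200_000, 1.0, 100.0))
```

A state whose ratio is near unity and whose Chi values are small in every field would count as "normal" in the sense of the text; a large positive Chi marks an overconcentration and a large negative one an underconcentration.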

To set the same information in another way we have taken a
field presentation in Table III; this shows immediately that it is
the useful technologies that vary sharply from state to state according
to their particular industries and needs. The exact sciences have
very little maldistribution and those that exist all seem reasonable
in view of the existence of special large laboratories and institutes.

TABLE III

WHERE THE SCIENTIFIC ACTION IS
Large positive Chi values of (actual - expected) figures for scientific manpower.
Source: National Register 1966.

[Agriculture]          [Earth sciences]       [Meteorology]          [Chemistry]
Oregon        54.3     Texas         83.4     Foreign       43.1     New Jersey    44.0
Idaho         37.4     Louisiana     54.4     Guam          28.6     Delaware      35.8
Montana       36.2     Oklahoma      56.8     Alaska        19.8     Ohio          24.4
Washington    24.5     Colorado      40.7     Hawaii        15.4     Pennsylvania  19.9
Alaska        20.7     Foreign       35.2     Florida       12.2     W. Virginia   13.4
Arkansas      19.7     Wyoming   [illegible]  Colorado      11.4     Illinois      11.7
Georgia       19.4     Alaska        15.1
Wyoming       19.1     Montana       11.6
S. Dakota     19.1     Mississippi   11.2
N. Dakota     13.7
S. Carolina   13.5
Utah          12.9
Maine         12.2
Nevada        10.9
Alabama       10.5
Arizona       10.1
Colorado      10.1

All Else

State                   Field           Chi
District of Columbia    Economics       41.4
New Mexico              Physics         24.8
District of Columbia    Statistics      24.4
Maryland                Biology         23.5
Massachusetts           Physics         23.4
California              Mathematics     22.8
California              Physics         20.5
New York                Psychology      19.3
Foreign                 Linguistics     18.8
Foreign                 Anthropology    14.3
Virginia                Mathematics     13.7
New York                Economics       10.7
N. Carolina             Biology         10.7

The  program  is  now  being  extended  to  take  other  computer 
methods  for  evaluating  the  statistical  distributions  of  science,  meas¬ 
ured  in  many  different  ways,  among  the  states  and  among  the  na¬ 
tions  of  the  world.  It  might  not  tell  one  exactly  what  to  do  in  science 
policy  or  how  to  decide  whether  a  particular  state  should  spend  a 
certain amount of money on this or that activity, but it does represent
a most useful monitoring system that can alert one to unintentional
and accidental overconcentrations and underconcentrations.
Our hopes are that as these and the other quantitative models develop,
we shall more and more come to know just how science works
in such a way that we can deploy our limited resources to the best
advantage.

II. Psycholinguistic Approaches
to the Study of Communication

George  A.  Miller 

For  many  years  psychologists  and  communication  engineers  have 
collaborated  to  test  and  improve  the  quality  of  voice  communica¬ 
tion  equipment.  In  this  collaboration,  the  psychologist’s  major 
interest  has  frequently  been  in  auditory  perception,  and  his  con¬ 
tribution  to  the  team  has  generally  been  to  determine  which  per¬ 
ceptual  aspects  of  a  voice  signal  must  be  faithfully  transmitted  if 
the  message  is  to  be  intelligible  to  a  receiver.  Today  this  minor 
branch  of  applied  science  is  reasonably  well  understood;  if  I  wished 
to  emphasize  past  accomplishments,  I  could  review  with  some  pride 
what  we  now  know  about  the  acoustic  nature  and  psychological 
perception  of  speech. 

It  is  more  challenging,  however,  to  tackle  problems  that  are  still 
unsolved.  There,  of  course,  is  where  our  research  attention  must  be 
focussed, and it is always more interesting to talk about what you
are still puzzling over. So I intend here to discuss questions about
some psychological matters that I believe are critically important
for our understanding of the communication process, but whose
answers  are  not  yet  completely  clear. 

Although  we  now  understand  a  great  deal  about  the  communi¬ 
cation  of  signals,  the  communication  of  meaning  remains  some¬ 
thing  of  a  mystery.  When  you  begin  to  probe  this  mystery,  you 
encounter immediately such enormously complicated and improbable
symbolic systems as grammar, dictionaries, referential relations,
logic, the human mind. One of the first tasks for a psychologist,
therefore,  is  to  establish  some  frame  of  reference  within  which  all 
these  diverse  and  complex  systems  can  live  together  in  peace  and 
harmony.  The  engineering  approach  to  the  study  of  communication, 
which  has  been  so  successful  in  characterizing  the  transmission  of 
signals,  must  be  replaced,  or  at  least  supplemented,  by  a  psycho- 
linguistic  approach.  I  would  like  first  to  discuss  some  of  the  more 
general  aspects  of  this  new  approach,  then  to  consider  particular 
examples  of  recent  psychological  research  on  syntax  and  semantics 
that  has  been  conducted  under  this  general  conception. 

SOME FUNDAMENTAL ASSUMPTIONS

Rene  Descartes  in  the  17th  century  formulated  modern  Western 
European  psychology  in  terms  of  a  dichotomy  between  corporeal 
body  and  incorporeal  soul,  thus  setting  a  trap  for  any  subsequent 
thinker  who  believed  psychology  is,  or  could  become,  a  rigorous 
science.  The  usual  methods  of  natural  science  do  not  obviously 
apply  to  this  invisible,  intangible,  nonextended  soul-stuff;  it  is  not 
necessary  to  apply  them,  Cartesians  argued,  since  knowledge  of 
one’s  soul  is  given  to  each  man  directly  by  intuition  and  need  not 
be inferred by inductive logic from scientific experiment.

I  mention  these  somewhat  creaky  philosophical  opinions  neither 
to  support  nor  refute  them,  but  rather  because  the  evidence  on 
which Descartes based his dichotomy provides an appropriate context
for the remarks I do want to make.

Descartes said that men differ from animals because only man has
a  soul;  animals  do  not  have  souls.  Animals  and  men  both  have 
bodies,  of  course,  but  bodies  are  mere  machines.  An  animal  differs 
from  man  because  the  animal’s  body  does  not  interact  with  an 
immortal,  incorporeal  soul.  It  was  the  possession  of  a  soul  that,  for 
Cartesians,  set  man  off  from  all  other  living  creatures.  Anyone  who 
has grown up in a post-Darwinian world is bound to question this
Cartesian  conclusion,  since  evolutionary  theory  makes  man  a  full- 
fledged  member  in  good  standing  of  the  animal  kingdom;  there  is 
no  biological  basis  for  any  sharp  dichotomy.  Yet  the  evidence  on 
which  Descartes’s  argument  rested  was  quite  clear,  and  just  as  valid 
now  as  then. 

Men have souls and animals do not, according to Descartes, because
men have language and animals do not. As the Cartesians
interpreted  this  fact,  language  is  so  subtle,  complicated,  and  influ¬ 
ential  that  it  is  inconceivable  that  any  mere  machine  could  under¬ 
stand  or  generate  it  as  humans  do.  Such  an  important  and  uniquely 
human  process  as  speech  could  only  result  from  the  possession  by 
man  of  some  novel  power,  some  special  essence— in  short,  from  a 
soul. 

The  evidence  Descartes  used  is  still  acceptable,  although  the 
argument  he  based  it  on  is  not.  All  men  in  all  societies  everywhere 
in  the  world  speak  one  or  more  varieties  of  human  language,  and 
man  is  the  only  creature  who  does  so.  Other  animals  have  signalling 
systems,  other  animals  communicate  in  many  interesting  ways,  but 
there  is  something  unique  about  the  human  animal’s  communica¬ 
tion  system  that  sets  it  apart  from  all  the  others— something  that 
both  cloaks  and  molds  the  human  animal’s  unprecedented  intelli¬ 
gence.  In  studying  language,  therefore,  we  come  close  to  that  which 
is  most  essentially  human  about  human  beings.  Nevertheless,  there 
are  few  modern  scientists  who  would  interpret  this  unique  method 
of  communication  as  an  argument  for  a  Cartesian  soul.  Our  posses¬ 
sion  of  language  is  a  consequence  of  an  evolutionary  process— an 
unprecedented  but  not  unnatural  event  that  occurred  only  in  the 
evolution  of  man. 

The  Cartesian  argument  that  language  is  too  complicated  for  any 
mere  machine  must  be  regarded  today  as  a  historical  commentary  on 
the  conception  of  a  machine  that  was  available  to  Descartes.  Today 
we  know  many  ways  to  process  natural  languages  by  machine,  and 
the  emergence  of  modern  digital  computers  has  enormously  ex¬ 
panded  our  conception  of  what  machines  can  be  and  do.  True,  there 
is  still  no  machine  that  deals  with  language  precisely  as  we  do,  but 
at  least  the  possibility  of  such  a  machine  is  no  longer  inconceivable. 
So  we  reject  this  argument  for  the  Cartesian  soul;  we  have  no  need 
for what Gilbert Ryle once called “the ghost in the machine.”

Still,  there  is  a  fascinating  problem  here.  Descartes  was  right  when 
he  saw  that  human  communication  is  qualitatively  different  from 
animal  communication.  For  psychologists,  this  difference  poses  a 
tantalizing  question.  How  can  it  be  characterized?  What  have  we  got 
that  animals  lack?  Is  the  difference  only  quantitative,  or  is  it  quali¬ 
tative?  Until  we  can  give  clear  and  satisfactory  answers,  we  cannot 
claim  to  understand  the  human  mind.  Some  progress  has  been  made 
toward  the  answers,  I  think,  but  much  about  our  human  gift  of
speech  still  remains  clouded  and  unclear. 

Predication.  Generations  of  philosophers  and  scientists  have 
struggled  with  these  questions;  I  will  not  review  their  history.  In 
the  main,  more  has  been  done  to  rephrase  the  questions  than  to 
answer  them. 

Perhaps  it  would  be  a  fair  summary  of  what  these  struggles  have 
accomplished  to  say  that  our  human  capacity  for  predication  lies 
close  to  the  heart  of  the  mystery.  Grammar  begins  with  the  sen¬ 
tence;  logic  begins  with  the  proposition;  both  grammar  and  logic 
take  predication  for  granted.  The  problem,  of  course,  is  to  character¬ 
ize  objectively  what  such  a  reformulation  could  mean.  A  biologist 
or  psychologist  should  not  take  predication  for  granted;  for  these 
sciences  predication  is  a  natural  phenomenon  demanding  an 
explanation. 

To  speak  of  predication  in  particular,  rather  than  of  speech  in 
general,  may  narrow  our  search,  but  does  not  provide  our  answers. 
The  sad  truth  is  that  we  are  still  unable  to  give  definitive  solutions 
for  these  old  and  important  puzzles.  But  I  want  to  argue  that  a 
scientific  formulation  of  them,  in  modern  psycholinguistic  terms,  at
least  suggests  how  some  answers  might  be  found. 

Consider  this  question:  when  a  monkey  warns  the  rest  of  its 
troop  that  a  predator  is  near,  how  does  the  vocal  cry  differ  from  the 
English  sentence,  “I  see  a  hungry  lion  nearby”? 

The  human  sentence  says  both  more  and  less  than  the  animal 
cry.  We  feel  it  would  make  sense  to  ask  whether  the  human  sentence 
is  true  or  false,  whereas  it  would  never  occur  to  us  to  ask  this  of  a 
monkey’s  cry.  But  is  the  appropriateness  of  truth  or  falsity  an  im¬ 
portant  psychological  difference? 

Perhaps  the  warning  cry  should  be  translated  as  “run  for  your 
lives,”  which  even  in  English  is  not  something  we  would  weigh  for 
truth  or  falsity;  advisable  or  inadvisable,  perhaps,  but  not  true  or 
false.  Yet  even  this  less  specific  translation  seems  to  distort  what 
actually  occurred. 

As  Grace  de  Laguna  pointed  out,  the  interpretation  of  an
animal  cry  is  tied  to  its  context  in  a  way  that  the  interpretation  of 
a  sentence  is  not,  or  need  not  be.  The  context  in  which  a  sentence 
is  produced  and  the  context  in  which  it  is  received  can  be  made 
completely  arbitrary  and  independent  of  one  another;  such  freedom 
is  not  available  for  most  animal  communication.  Perhaps  this  freedom
is  the  major  psychological  consequence  of  having  propositional
language.

When  we  equate  a  monkey’s  cry  to  a  human  sentence  we  project 
human  psychology  in  a  most  imperialistic  way.  How  far  do  we  dare 
to  carry  anthropocentrism?  Can  we  go  so  far  as  to  translate  a 
monkey’s  silence  as  “I  do  not  see  a  hungry  lion  nearby”?  The  suggestion
is  absurd;  think  of  all  the  things  that  a  monkey’s  silence
means  it  is  not  seeing!  The  monkey’s  cries  and  silences  neither
affirm  nor  deny  anything  about  a  lion— they  are  simply  not  predications.
Something  essential— and  essentially  human— is  lacking  in  the
cry,  but  unavoidable  in  the  sentence.

If  we  accept  this  intuition  that  there  is  something  more  to  human 
speech  than  to  animal  cries,  we  should  try  to  specify  what  the  “something
more”  consists  in.  One  obvious  possibility  is  that  the  difference
can  be  attributed  entirely  to  the  greater  combinatorial  productivity
of  human  language.  Speech  combines  elements  freely  into  an  unlimited
variety  of  significant  sequential  patterns.  Animal  cries  are,
by  and  large,  relatively  stereotyped  and  invariant,  and  a  sequence 
of  them,  unlike  a  sentence,  is  little  more  than  a  list  of  vocal  responses. 

The  combinatorial  productivity  of  human  language  is  obviously 
important,  yet  in  and  of  itself  productivity  does  not  explain  the 
difference  between  men  and  animals.  Why  must  human  signals  be 
more  various?  What  special  human  need  does  this  combinatorial 
versatility  serve?  One  answer,  in  very  general  terms,  is  that  a  highly 
productive,  combinatorial  system  of  signals  can  free  communication 
from  the  context  of  the  immediate  environment  in  which  it  occurs. 
Human  language  is  characterized  by  sentences  that  combine  a  topic 
and  a  comment  on  that  topic;  that  is  what  we  mean  by  predication. 
In  its  most  primitive  form,  perhaps,  the  topic  or  the  comment  can 
be  supplied  by  gestures,  or  by  pointing  to  things  nearby.  But  ges¬ 
tures  are  tied  to  the  context  in  which  they  are  produced.  In  order 
to  gain  freedom  from  the  context  of  communication— to  provide 
vocal  substitutes  for  all  possible  gestures— a  great  variety  of  descrip¬ 
tive  signs  are  needed,  enough  signs  to  name  every  thing  or  aspect 
of  a  thing  about  which  some  comment  might  be  made. 

Nominalization.  Yet  even  that  cannot  be  the  whole  story,  for
human  language  is  far  more  productive  than  our  freedom  from  the 
communication  context  would  lead  us  to  expect.  The  productive 
character  of  human  language  is  greatly  enhanced  by  the  fact  that
one  sentence  can  become  the  comment,  or  the  topic  for  a  comment, 
in  a  second  sentence. 

This  recursion  may  sound  complicated,  but  it  is  not.  Consider 
an  example:  “Mary  sings”  is  a  sentence  formed  by  introducing
Mary  as  a  topic  for  a  comment  about  her  vocalization;  in  “Mary’s 
singing  is  loud,”  the  sentence  “Mary  sings”  has  been  nominalized— 
made  into  a  name  in  order  to  serve  as  the  topic  of  another  sentence 
—and  combined  with  a  comment  about  auditory  magnitude.  The 
game  can  be  continued:  “Mary’s  singing  is  loud”  can  be  nominalized
and  made  the  topic  of  “It  surprised  John,”  as  in  the  sentence,  “The 
loudness  of  Mary’s  singing  surprised  John.”  And  then  “John’s 
surprise  at  the  loudness  of  Mary’s  singing  was  obvious,”  and  so  on 
and  on. 

“John’s  surprise  at  the  loudness  of  Mary's  singing”  is  not  a 
sentence;  it  is  a  name,  just  as  “John”  and  “Mary”  are  names.  By 
this  device  a  human  language  acquires  infinitely  more  names  than 
nouns.  It  is  an  interesting  psychological  question  to  ask  whether 
predications  and  nominalizations  are  understood  in  the  same  way; 
whether  “Mary  sings”  and  “the  singing  of  Mary”  are  cognitively 
different  in  any  way.  Combinatorially,  however,  the  point  is  that 
predication  by  itself  is  not  very  productive,  since  it  combines  topics 
and  comments  only  two  at  a  time,  but  taken  together  with  nom- 
inalization  it  becomes  infinitely  productive. 
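
The combinatorial point can be made concrete with a small counting sketch. It rests on an assumption that is not in the text: every predication (a name plus a comment) can be nominalized into exactly one new, distinct name. The function name and the sample numbers are purely illustrative.

```python
def count_names(nouns: int, comments: int, rounds: int) -> int:
    """Count the names available after `rounds` rounds of nominalization.

    Assumption (illustrative only): each (name, comment) predication
    nominalizes into one new, distinct name.
    """
    names = nouns
    for _ in range(rounds):
        # Every existing name can take any comment, and each resulting
        # predication becomes a name; the plain nouns stay available.
        names = nouns + names * comments
    return names


# With no nominalization the stock of names is just the nouns;
# each further round multiplies it.
for rounds in range(4):
    print(rounds, count_names(nouns=2, comments=3, rounds=rounds))
```

Under these assumptions the count never levels off, which is one way to read the claim that predication together with nominalization is unboundedly productive.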

What  logicians  generally  mean  by  predication  is  that  a  comment 
is  made  about  a  topic— is  affirmed  or  denied  of  it— in  such  a  manner 
that  any  person  who  understands  the  predication  will  recognize  the 
conditions  under  which  it  would  be  true  or  false.  If  we  were  to 
accept  this  relation  as  fundamental  for  linguistics  as  well  as  for 
logic,  then  rules  of  grammar  in  any  particular  language  would  be 
viewed  as  machinery  whereby:  (1)  predications  were  embodied  in 
pronounceable  sentences;  and  (2)  predications  were  nominalized 
to  serve  as  constituents  in  more  complex  predications. 

There  have  been  objections  to  this  conception  of  grammar.  For 
instance,  some  sentences— imperatives  and  interrogatives— are  not 
propositions,  so  grammar  would  seem  to  have  more  machinery  than 
the  bare  minimum  necessary  for  predication.  Yet  all  these  non- 
propositional  sentences  are,  more  or  less  directly,  derivable  from 
predicative  origins.  "Close  the  door"  is  neither  true  nor  false,  yet 
it  is  grammatically  kin  to  “You  will  close  the  door,”  which  can  be
true  or  false.  Similarly,  “Who  closed  the  door?”  is  not  a  proposition, 
yet  it  is  grammatically  related  to  “Someone  closed  the  door,”  which 
is.  At  a  deeper  level,  therefore,  the  predicative  structure  of  such 
sentences  is  still  understood  by  any  person  who  knows  English. 

It  is  this  predicative  aspect  of  language  that  is  unique  to  human 
communication  and  not  found  in  other  animals.  Most  of  the  com¬ 
plex  grammatical  machinery  of  human  language  is  entailed  by  the 
need  to  actualize  this  predicative  relation  in  pronounceable  form. 

In  order  to  describe  this  uniquely  human  mode  of  communica¬ 
tion,  therefore,  we  must  deal  on  the  one  hand  with  rules  that 
govern  the  formation  and  transformation  of  predicated  relations 
into  sentences,  and  on  the  other  hand  with  rules  that  govern  the 
meaningful  interpretation  of  words  and  sentences.  In  philosophical 
terms,  we  must  deal  with  syntax  and  semantics.  In  linguistic  terms, 
we  must  deal  with  grammar  and  lexicon.  These  two  central  aspects, 
therefore,  provide  an  organization  for  the  psycholinguistic  research 
I  want  to  describe.* 

I  hope  it  is  obvious  that  I  intend  to  raise  more  questions  than  I 
answer;  do  not  expect  from  me  any  revelation  of  the  ultimate  source 
of  the  mysterious  power  language  gives.  The  most  I  can  offer  is  a 
new  way  to  ask  some  old  questions.  And  perhaps  I  will  cause  you 
to  think. 

SYNTAX 

Let  me  begin  with  syntax.  I  would  discuss  it,  not  as  a  philosopher 
or  linguist,  but  as  a  psychologist  concerned  to  understand  the 
cognitive  processes  whereby  native  speakers  of  a  language  conform 
so  accurately  and  unconsciously  to  the  intricate  patterns  described 
by  grammarians  (psyntax?).  In  order  to  give  a  complete  account  of 
our  syntactic  skills,  of  course,  we  would  need  explicit  and  detailed
information  about  all  of  the  known  grammatical  regularities.  Even
if  the  necessary  grammatical  knowledge  were  at  my  fingertips  (and
it  is  not),  this  would  not  be  the  time  or  place  to  display  it.  However,
a  bare  minimum  of  grammatical  knowledge  is  necessary  if  I  am  to
discuss  the  topic  at  all.  So  let  me  quickly  illustrate  the  sort  of  grammatical
knowledge  that  all  of  us,  as  speakers  of  English,  must  share,
at  least  implicitly,  if  not  explicitly.

*  A  full  discussion  would  treat  at  least  three  aspects  of  language,  since  phonology
could  not  be  ignored  in  any  comprehensive  discussion  of  spoken  communication.
By  avoiding  it  here  I  do  not  wish  to  give  a  false  impression  that  all
problems  of  phonetics  and  phonemics,  of  articulation  and  perception  of  speech,
of  electroacoustic  transduction  and  transmission  of  speech  waves  have  been
solved.  But  it  is  certainly  true  that  those  problems  are  better  understood  than
are  the  syntactic  and  semantic  aspects  which  play  such  a  crucial  role  in  the
communication  of  meaning.

Constituent  structure.  Take  a  simple,  declarative  sentence,  “Bill 
hit  the  ball.”  It  contains  four  words,  and  their  order  is  important. 
Rearrange  the  order  of  these  words  and  you  get  something  very 
different,  e.g.,  “The  ball  hit  Bill”  is  a  sentence  with  a  different 
meaning,  and  “Ball  Bill  the  hit”  is  no  sentence  at  all.  One  thing  a
grammar  of  English  should  tell  us,  therefore,  is  why  some  orderings 
of  words  make  admissible  sentences  and  other  orderings  do  not. 

One  approach  to  this  aspect  of  language  is  to  assume  that  mes¬ 
sages  are  generated  “from  left  to  right,”  one  word  at  a  time,  and 
that  each  successive  word  is  chosen  in  the  context  of  the  preceding 
words.  This  conception  of  a  message  source  (as  a  Markov  process) 
has  been  widely  used  in  information  theory,  and  has  many  ad¬ 
vantages  for  the  statistical  analysis  of  speech.  Unfortunately  for 
linguists  and  psychologists,  admissible  linguistic  patterns  are  more 
complicated;  a  Markov  process  does  not  provide  a  valid  charac¬ 
terization  of  the  grammatical  or  cognitive  structure  of  our  sentences. 
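
To make the "left to right" idea concrete, here is a minimal sketch of a first-order Markov message source, in which each word is chosen by looking only at the word just before it. The transition table is a hypothetical toy invented for this illustration, not data from any study.

```python
import random

# Hypothetical transition table: for each word, the words allowed to follow it.
TRANSITIONS = {
    "<start>": ["Bill", "the"],
    "Bill": ["hit", "<end>"],
    "the": ["ball", "boy"],
    "ball": ["hit", "<end>"],
    "boy": ["hit", "<end>"],
    "hit": ["the", "Bill"],
}


def generate(rng: random.Random, max_words: int = 12) -> str:
    """Emit words one at a time, each chosen from the previous word alone."""
    word, words = "<start>", []
    while len(words) < max_words:
        word = rng.choice(TRANSITIONS[word])
        if word == "<end>":
            break
        words.append(word)
    return " ".join(words)


rng = random.Random(0)
for _ in range(5):
    print(generate(rng))
```

Because the table permits "boy" to be followed either by "hit" or by the end of the message, this source can emit "Bill hit the ball" but also strings such as "Bill hit the boy hit the ball". The chain has no record of whether "the boy" opened the sentence or closed a verb phrase, which hints at why word-by-word context alone is a poor model of sentence structure.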

Consider  once  again,  “Bill  hit  the  ball.”  The  words  seem  to  go 
together  in  groups;  “the  ball”  is  a  natural  group  and  “hit  the”  is 
not.  Linguists  describe  such  grouping  in  terms  of  constituent 
analysis.  For  example,  “the  ball”  is  a  constituent  of  the  sentence;  we
can  replace  “the  ball”  by  “it”  and  still  have  roughly  the  same  gram¬ 
matical  structure.  However,  “hit  the”  cannot  be  replaced  by  any 
single  word  without  completely  changing  the  structure  of  the  sen¬ 
tence,  and  so  “hit  the”  is  not  a  constituent. 

If  we  proceed  in  this  way,  we  get  the  constituent  analysis  pre¬ 
sented  in  Fig.  1,  where  constituent  substitutions  are  shown  above 
and  their  grammatical  names  are  abbreviated  below.  This  simple 
sentence  has  two  constituents,  a  noun  phrase  (“Bill”)  and  a  verb 
phrase  (“hit  the  ball”).  The  verb  phrase  likewise  has  two  constitu¬ 
ents,  a  verb  (“hit”)  and  a  noun  phrase  (“the  ball”).  And,  finally,  the 
noun  phrase  has  two  constituents,  an  article  (“the”)  and  a  noun 
(“ball”). 

Figure  1.  Example  of  the  analysis  of  a  sentence  into  its  grammatical
constituents.
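
The constituent analysis just described can be written down as a nested structure. The sketch below is only one possible encoding (nested tuples with a label first); the function names are illustrative and not part of any linguistic formalism.

```python
# Each node is (label, child, child, ...); a child is a word or another node.
TREE = (
    "S",
    ("NP", "Bill"),
    ("VP",
     ("V", "hit"),
     ("NP", ("T", "the"), ("N", "ball"))),
)


def words(node) -> list:
    """Return the words spanned by a node, in left-to-right order."""
    _label, *children = node
    out = []
    for child in children:
        out.extend([child] if isinstance(child, str) else words(child))
    return out


def constituents(node) -> set:
    """Return the word strings that form a constituent under this node."""
    _label, *children = node
    found = {" ".join(words(node))}
    for child in children:
        if not isinstance(child, str):
            found |= constituents(child)
    return found


print(sorted(constituents(TREE)))
print("the ball" in constituents(TREE), "hit the" in constituents(TREE))
```

In this encoding "the ball" is a constituent because some node spans exactly those words, while "hit the" is not, since no node spans it.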

It  should  be  clear  that  the  constituent  structure  of  a  sentence  is 
hierarchical  in  nature,  a  fact  that  is  made  explicit  in  Fig.  2,  where  a 
tree  graph  gives  an  easily  visualized  summary  of  the  analysis.  Moreover,
Fig.  2  also  indicates  how  this  phrase  structure  might  be
characterized  as  a  consequence  of  the  grammatical  rules  of  English. 
Noam  Chomsky,  linguist  and  philosopher  at  the  Massachusetts  In¬ 
stitute  of  Technology,  has  adapted  “rewriting  rules”  from  formal 
logic  in  order  to  show  how,  by  following  explicit  rules,  sentences  of 
the  appropriate  structure  might  be  derived,  just  as  theorems  in 
logic  are  derived  from  a  basic  axiom  by  applying  rules  of  deduction. 
We  begin  with  the  axiom  S,  to  which  we  apply  rewriting  rule  F1
to  obtain  NP  +  VP;  then  F3  applied  to  VP  gives  us  NP  +  V  +  NP;
etc.,  until  eventually  the  string  is  rewritten  as  “Bill  +  hit  +  the  +
ball.”  Since  the  grammar  does  not  contain  rules  for  rewriting  these
symbols,  at  this  point  we  have  produced  what  is  called  a  “terminal
string”— i.e.,  a  sentence.  Thus  rules  F1-7  comprise  an  example  of
what  Chomsky  calls  a  generative  grammar.

Several  warnings  must  be  issued  immediately.  Rules  F1-7  are  at
best  only  a  tiny  fragment  of  English  grammar;  they  generate  a  few 
other  sentences,  but  nothing  like  the  full  range  of  English.  For  that 
we  would  need  an  enormously  enlarged  grammar,  including  some 
kinds  of  rules  more  powerful  than  any  of  those  illustrated.  Many 
essential  facts  of  grammar  have  here  been  deliberately  suppressed  in
order  to  make  the  example  as  intelligible  as  possible.  A  fuller  treatment
of  English  grammar  would  have  to  add  extensively  to  this
beginning.  In  particular,  we  would  need  transformational  rules;
but  more  about  them  later.

F1.  S   →  NP  +  VP
F2.  NP  →  T  +  N
F3.  VP  →  V  +  NP
F4.  NP  →  Bill,  John
F5.  T   →  the,  a
F6.  N   →  boy,  girl,  ball
F7.  V   →  hit

Figure  2.  A  fragment  of  the  rules  for  a  generative  grammar  of  English,  and
a  tree  graph  to  represent  the  phrase  structure  of  one  of  the  sentences  that
the  grammar  will  generate.
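
Because rules F1-7 are so few, the whole set of sentences they generate can be enumerated mechanically. The following is a minimal sketch, assuming the rules are encoded as a dictionary from each symbol to its alternative rewritings; the encoding and the function name are choices made for this illustration.

```python
from itertools import product

# Rules F1-F7, keyed by the symbol being rewritten.
RULES = {
    "S": [["NP", "VP"]],                      # F1
    "NP": [["T", "N"], ["Bill"], ["John"]],   # F2 and F4
    "VP": [["V", "NP"]],                      # F3
    "T": [["the"], ["a"]],                    # F5
    "N": [["boy"], ["girl"], ["ball"]],       # F6
    "V": [["hit"]],                           # F7
}


def expand(symbol: str) -> list:
    """Return every terminal string (a list of words) derivable from symbol."""
    if symbol not in RULES:  # no rewriting rule applies: a terminal word
        return [[symbol]]
    results = []
    for right_side in RULES[symbol]:
        # Combine every expansion of each symbol on the right-hand side.
        for parts in product(*(expand(s) for s in right_side)):
            results.append([w for part in parts for w in part])
    return results


sentences = [" ".join(ws) for ws in expand("S")]
print(len(sentences))                      # 64
print("Bill hit the ball" in sentences)    # True
print("ball Bill the hit" in sentences)    # False
```

The count follows from the rules: eight noun phrases (Bill, John, and six article-noun pairs) can fill each of the two noun-phrase positions around the single verb. The enumeration terminates only because this fragment has no recursive rule; a fuller grammar would not allow exhaustive listing.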

And— a  second  warning— the  characterization  of  a  generative
grammar  in  terms  of  a  deductive  system  with  rewriting  rules,  as 
suggested  in  Fig.  2,  is  a  formal  convenience  for  the  grammarian  (and 
highly  suggestive  for  those  who  write  computer  programs),  but  it 
may  or  may  not  bear  any  explicit  resemblance  to  what  goes  on  "in 
our  heads"  when  we  produce  or  interpret  sentences.  The  same  struc¬ 
tural  relations  in  the  sentence  can  be  characterized  in  several  alter¬ 
native  ways,  some  of  which,  though  less  elegant  formally,  may  be 
more  realistic  psychologically. 

Psychological  validity  of  constituent  structure.  Regardless  of  what 
formal  or  analytic  notation  we  use  to  represent  it,  however,  the 
knowledge  that  is  represented  by  such  rules  must  somehow  be  avail¬ 
able  to  people  who  speak  and  understand  English.  Their  possession 
of  this  knowledge  is  an  empirical  fact  that  can  be  subjected  to  test. 
The  psychological  validity  of  constituent  structure  analysis  can  be 
demonstrated  in  the  psychological  laboratory  in  a  variety  of  ways. 
To  illustrate  this  kind  of  research,  I  will  describe  just  one  particular 
type  of  experiment;  it  is  but  one  representative  of  a  variety  of  demonstrations
of  constituent  effects.

It  is  a  general  principle  of  perceptual  organization  that  irrelevant 
stimulation  tends  to  interfere  minimally  with  our  perception  of 
structural  wholes.  Garrett  provided  an  auditory  example  of  this 
principle  when  he  demonstrated  that  if  a  string  of  digits  is  pro¬ 
nounced  with  a  noticeable  pause  between  some  particular  pair  (thus 
organizing  the  string  perceptually  into  two  substrings),  a  short,  ir¬ 
relevant  noise  imposed  on  the  spoken  digits  will  tend  to  be  reported 
as  occurring  during  this  pause.  The  irrelevant  noise  is  perceived  in 
such  a  way  as  to  interfere  minimally  with  the  perceptual  organization
of  the  spoken  digits.

One  attractive  feature  of  Garrett’s  result  is  that  the  perceptual  dis¬ 
placement  of  irrelevant  clicks  can  be  used  as  an  indicator  of  sub¬ 
jective  organization  in  instances  where  acoustic  segmentation  is 
less  obvious— in  grammatical  sentences,  for  example.  Fodor  and 
Bever,  adapting  a  technique  introduced  by  Ladefoged  and  Broad- 
bent,  used  this  indicator  to  explore  the  perceptual  reality  of  seg¬ 
ments  identified  linguistically  as  immediate  constituents  of  a  sen¬ 
tence.  Their  results  were  consistent  with  the  hypothesis  that  gram¬ 
matical  constituents  are  the  functional  units  of  speech  perception. 

Consider  the  sentence,  “That  he  was  happy  was  evident  from  the 
way  he  smiled.”  The  surface  (or  constituent)  structure  is  dia¬ 
grammed  in  Fig.  3;  in  the  lower  half  of  this  figure  horizontal  lines 
indicate  the  extent  of  the  various  perceptual  units  on  the  hypoth¬ 
esis  that  these  units  are  grammatical  constituents.  From  Fig.  3  it 
can  be  seen  that  there  is  a  major  structural  break  in  this  sentence  between
the  words  “happy”  and  “was.”  The  sentence  can  be  (and
was)  spoken  in  such  a  way  as  to  leave  no  objectively  measurable
acoustic  pause  between  these  two  words,  but  even  so,  if  linguistic
structure  is  a  controlling  factor,  the  sentence  should  be  perceived
as  if  the  speaker  had  paused  at  this  major  constituent  break.  In  that
case,  extraneous  clicks  should  be  judged  as  displaced  toward  this
perceptual  boundary.

Subjects  in  Fodor  and  Bever's  experiment  heard  recorded  sentences
in  one  ear  and  a  loud  click  in  the  opposite  ear.  The  time  of
the  click  could  be  varied,  as  indicated  in  Fig.  3,  to  coincide  with
the  major  constituent  boundary,  to  precede  it,  or  to  follow  it.  The
listener's  task  was  to  write  down  the  sentence  (this  task  forced  him  to
pay  attention  to  the  whole  sentence)  and  then  to  mark  the  location
of  the  click.  When  the  results  were  analyzed,  a  statistically  significant
displacement  was  found  in  the  direction  predicted,  that  is,  toward
the  major  syntactic  boundary.  When  the  click  preceded  the  major
break,  there  was  a  tendency  to  report  that  it  had  occurred  later  than
it  actually  had;  conversely,  when  it  occurred  following  the
major  break,  listeners  tended  to  report  that  it  had  occurred  earlier
than  it  actually  had.  This  is  consistent,  of  course,  with  the  hypothesis
that  grammatical  constituents  are  psychological  units  and
that  an  interfering  click  was  shifted  perceptually  so  as  to  interrupt
as  few  perceptual  units  as  possible.

Figure  3.  Illustrating  the  click  placements  used  in  the  experiment  by
Fodor  and  Bever.

Although  efforts  were  made  to  avoid  it,  it  is  possible,  of  course,
that  the  perceived  groupings  reflected  acoustic  pauses  or  intonation,
rather  than  the  constituent  structure  assumed  by  the  grammarian.
In  order  to  control  for  this  possibility,  therefore,  Garrett,  Bever,  and
Fodor  repeated  the  experiment  with  sentences  in  which  exactly  the
same  acoustic  stimulus  was  provided  in  both  cases.  The  alternative
segmentations  were  suggested,  not  by  any  physical  attribute  of  the
speech  signal,  but  by  the  context  in  which  the  phrase  occurred.  For
example,  take  a  tape  recording  of  the  phrase  “George  drove  furiously
to  the  station”  and  synchronize  a  click  to  occur  with  “George.”
Now  present  this  stimulus  to  listeners  in  the  context,  “In  order  to
catch  his  train  George  drove  furiously  to  the  station”;  the  major
constituent  boundary  is  between  “train”  and  “George.”  As  predicted,
a  click  that  coincides  with  “George”  now  tends  to  be  reported
as  occurring  earlier,  at  the  perceptual  boundary.  As  a  control,
Garrett,  Bever,  and  Fodor  also  presented  this  same  acoustic
stimulus  in  the  context  “The  reporters  assigned  to  George  drove
furiously  to  the  station”;  the  major  constituent  boundary  now  falls
between  “George”  and  “drove.”  Now,  as  predicted,  a  click  that
coincides  with  “George”  is  reported  as  occurring  later,  rather  than
earlier.  The  click  is  again  shifted  toward  the  constituent  boundary,
but  now  in  the  opposite  direction— and  for  identical  acoustic  stimulation.
The  perceived  boundary  must  have  existed  in  the  mind  of
the  listener,  not  in  the  acoustic  stimulus.

Such  studies  as  these  support  the  opinion  that  sentence  interpreta¬ 
tion  is  an  active  process;  a  listener  actively  imposes  a  structural 
analysis  on  a  sentence,  rather  than  responding  more  or  less  passively
to  some  acoustic  clues  that  mark  its  structure.  Moreover,
within  the  limits  tested,  this  active  cognitive  process  seems  to  confirm
the  grammatical  analysis  which  was,  of  course,  arrived  at  on
other  and  very  different  grounds. 

There  are  other  experiments  that  could  be  cited  to  confirm  this
conclusion,  but,  since  most  people  are  inclined  to  accept  it  anyhow,
I  will  not  belabor  the  point  that  constituent  structure  has  psychological
validity.  From  a  psychological  point  of  view,  the  salient  feature
of  constituent  structure  is  its  suitability  for  expressing  predication;
its  effects  on  our  perception  and  memory  for  sentences  are  only
a  confirming  by-product.

Constituent  analysis  of  sentences  into  a  noun  phrase  and  a  verb
phrase  (as  shown  in  both  Figs.  1  and  2)  serves  directly  to  represent
the  cognitive  relation  between  a  topic  and  a  comment  on  that
topic,  which  is  the  essence  of  predication.  Up  to  this  point,  therefore,
we  can  conclude  that  the  linguistic  and  the  psychological
characterizations  are  compatible  (even  though  we  may  not  be  entirely
certain  what  the  psychological  implications  are  of  expressing
the  grammatical  characterization  as  a  rewriting  rule).  Predication
requires  a  two-part  construction,  and  English  grammar  provides  it
handsomely.

Transformational  rules.  However,  when  we  look  at  more  complicated
sentences— and  nearly  every  sentence  we  actually  use  is  more
complicated  than  “Bill  hit  the  ball”— we  find  it  necessary  to  introduce
what  Chomsky  has  called  transformational  rules.

We  have  already  mentioned  nominalization:  if  we  want  to  characterize
the  structure  of  “Bill's  hitting  of  the  ball  was  skillful,”  the
simple  way  to  do  so  is  to  exploit  the  close  relation  between  “Bill  hit
the  ball”  and  “Bill’s  hitting  of  the  ball”;  indeed,  to  assume  that
both  the  sentence  in  its  declarative  form  and  the  nominalization  of
the  sentence  are  derived  from  essentially  the  same  underlying  structure,
but  with  slightly  different  transformations  applied  in  the  two
cases.  In  a  similar  way,  “The  ball  was  hit  by  Bill,”  “Who  hit  the
ball?”,  “Bill  didn't  hit  the  ball,”  etc.,  would  all  be  characterized
as  deriving  from  the  same  deep  structure,  but  would  have  different
surface  structures  because  different  transformational  rules  would
have  been  applied  in  their  derivations.

Precisely  how  transformational  grammars  should  be  formulated
is  a  central  problem  for  linguistic  theory;  one  that  is  currently  receiving
much  attention  and  about  which  opinions  are  developing
rapidly.  Rather  than  try  to  summarize  this  shifting  scene,  I  would
prefer  to  consider  it  in  terms  of  the  performances  that  have  to  be
explained,  rather  than  in  terms  of  some  more  abstract  theory  about
a  language  user's  underlying  competence.

In  that  spirit,  therefore,  let  me  describe  a  psychological  interview.
The  interviewer  wears  two  hand  puppets.  On  his  left  hand  is
The  Old  Woman,  on  his  right,  Alligator.  He  is  speaking  to  a  young
child.

“The  Old  Woman  and  Alligator  are  going  to  talk  to  each  other,”
the  interviewer  tells  the  child.  “First  The  Old  Woman  will  say
something  and  then  Alligator  will  answer.  You  listen  closely  to
what  Alligator  says,  because  in  a  minute  you  are  going  to  be  Alligator
and  you  will  have  to  give  the  right  answers  to  The  Old  Woman.”  The
child  nods  solemnly,  and  the  psychologist  makes  the  puppets  talk,  as
follows:

TOW:  Johnny  is  a  good  boy.

A:  Isn't  he?

TOW:  His  friends  are  good.

A:  Aren't  they?

TOW:  Tomorrow  they  will  all  go  on  a  picnic.

A:  Won’t  they?

The  conversation  between  the  two  puppets  continues  in  this 
fashion  until  the  child  has  heard  several  instances  of  tag  questions.* 
He  is  then  given  an  opportunity  to  play  Alligator.  If  he  understands 
that  Alligator  always  responds  with  a  tag  question  derived  mechanically
from  The  Old  Woman’s  sentence,  and  if  he  has  the  grammatical
competence  needed  to  make  the  derivation,  a  satisfactory
performance  can  be  expected  from  the  child.  Older  children  per¬ 
form  perfectly;  younger  children  make  mistakes.  In  this  way  (among 
others)  Professor  Roger  Brown  and  his  colleagues  at  Harvard 
University  have  been  exploring  successive  stages  in  the  development 
of  grammatical  competence  and  performance  in  English-speaking 
children. 

*  It  should  be  obvious  that  the  interview  need  not  be  limited  to  tag  questions.
There  are  many  types  of  sentence  pairs  that  can  be  used  in  such  a  test  of  a  child's
knowledge.  For  example,  an  interviewer  can  use  active  and  passive  voice,  with
one  puppet  saying,  “Johnny  ate  the  apple,”  and  the  other  replying,  “The  apple
was  eaten  by  Johnny,”  etc.  Or  he  can  use  affirmative  and  negative  forms:
“Johnny  will  find  his  shoe”  vs.  “Johnny  won’t  find  his  shoe,”  etc.  Or  various
other  types  of  questions  can  be  used:  “Johnny  saw  the  fight”  can  be  paired  with
“What  did  Johnny  see?”  or  “Who  saw  the  fight?”  or  “Did  Johnny  see  the  fight?”
Or  pairs  of  sentences  from  The  Old  Woman  can  be  combined  into  one  by
Alligator:  “Johnny  came  home  and  Mary  came  home”  can  elicit  “Johnny  and
Mary  came  home,”  or  “Johnny  heard  the  burglar  and  Johnny  called  his  father”
can  elicit  “Johnny,  who  heard  the  burglar,  called  his  father”;  or  “Mary  sings
and  it  sounds  pretty”  can  elicit  “Mary's  singing  sounds  pretty,”  etc.,  each  pair  of
sentences  testing  different  transformational  relations.  The  list  of  sentence  relations
could  be  extended  considerably,  so  there  is  no  shortage  of  tasks  to  set.

The  relations  between  pairs  of  sentences  just  illustrated  are  all  grammatical.
That  is  to  say,  any  adequate  grammar  of  English  should  include  an  explicit  account
of  the  rules  for  generating  all  these  sentences,  and  from  those  rules  it
should  be  possible  to  see  precisely  what  the  relations  between  the  sentences  are
in  each  instance.

To  move  directly  from  The  Old  Woman’s  sentence
to  Alligator’s  tag  response  would  involve  several  operations.  Take,
for  example,  the  pair:  “Johnny  is  a  good  boy”  and  “Isn’t  he?”  The
operations  involved  here  would  be:  (1)  recognition  of  the  subject
of  the  sentence,  in  this  case,  “Johnny”;  (2)  pronominalization  of
the  subject,  which  turns  “Johnny”  into  “he”;  (3)  recognition  of  the
appropriate  verb,  in  this  case,  “is”;  (4)  negation  of  this  verb,  which
turns  “is”  into  “isn’t”;  and,  finally,  (5)  inversion  of  the  order  of
subject  and  verb,  which  turns  “he  isn’t”  into  “isn’t  he?”  Thus  we
could  imagine  the  following  sequence:

Johnny  is  a  good  boy. 

Johnny  is. 

He  is. 

He  isn’t. 

Isn’t  he? 

Any  person  who  can  play  Alligator’s  role  and  supply  the  appro¬ 
priate  tag  questions  must  have  a  working  knowledge  of  these  gram¬ 
matical  relations,  and  of  the  way  they  go  together  to  produce  the 
appropriate  tag.  Recognizing  the  subject  of  a  sentence  may  demand 
a  sophisticated  analysis;  it  can  be  especially  tricky  when  the  subject 
is  not  explicitly  given,  as  in  imperatives:  “Close  the  door”  should 
elicit  “Won’t  you?”,  even  though  the  appropriate  subject  and  verbal 
auxiliary,  “You  will,"  are  both  missing  from  the  original  sentence. 

Correct  performance  with  tag  questions  also  entails  considerable 
ability  to  analyze  verb  constructions. 

Johnny  will  have  been  running.  Won’t  he? 

Johnny  has  been  running.  Hasn’t  he? 

Johnny  is  running.  Isn’t  he? 

Johnny  ran.  Didn’t  he? 

The  verbal  component  used  in  a  tag  question  is  the  first  auxiliary 
verb,  unless— as  in  the  last  case— there  isn’t  any  auxiliary,  in  which 
case  the  convenient  verb  “do”  is  introduced  to  play  the  part  of  the 
missing  element. 

Even  negation  demands  some  syntactic  sophistication  from  a  suc¬ 
cessful  subject,  since  the  negative  must  be  added  when  it  is  missing 
and  removed  when  it  is  present: 

Johnny  has  run.  Hasn't  he? 

Johnny  hasn’t  run.  Has  he? 
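
The steps just described (recognizing the subject, pronominalizing it, finding the first verbal element, adding or removing the negative, and inverting) can be sketched in a few lines of Python. This is only a toy under strong assumptions, not a model of what a speaker does: the tables `PRONOUNS`, `NEGATED` and `POSITIVE` and the function `tag_question` are hypothetical, the subject is assumed to be the first word and to appear in the pronoun table, and a verb that is not a listed auxiliary is assumed to be in the past tense, so that "do" always surfaces as "didn't".

```python
# Toy tag-question builder. Every table here is a hand-made assumption
# that covers only the example sentences used in the text.
PRONOUNS = {"Johnny": "he", "Mary": "she"}

# Positive auxiliary -> negated form, plus the reverse mapping.
NEGATED = {"is": "isn't", "has": "hasn't", "will": "won't", "did": "didn't"}
POSITIVE = {neg: pos for pos, neg in NEGATED.items()}


def tag_question(sentence):
    words = sentence.rstrip(".").split()
    # Naive recognition: first word is the subject, second is the verb.
    subject, first_verb = words[0], words[1]
    pronoun = PRONOUNS[subject]            # pronominalization
    if first_verb in NEGATED:              # positive auxiliary: add the negative
        verb = NEGATED[first_verb]
    elif first_verb in POSITIVE:           # negative auxiliary: remove the negative
        verb = POSITIVE[first_verb]
    else:                                  # no auxiliary: introduce "do" (past tense assumed)
        verb = "didn't"
    # Inversion: the verb precedes the pronoun in the tag.
    return f"{verb.capitalize()} {pronoun}?"


for s in ["Johnny will have been running.", "Johnny ran.", "Johnny hasn't run."]:
    print(s, tag_question(s))
```

The sketch works only because the hard parts have been assumed away: nothing in it can find a subject that is not the first word, or decide which pronoun a noun phrase requires.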

A  linguist  might  be  content  with  a  clear  and  correct  description 
of  the  formal  relations  among  such  sentences,  but  a  psychologist  who 
is  interested  in  how  people  perform  the  amazingly  skilled  acts  which 
underlie  even  the  simplest  sentences  would  like  to  know  more.  He 
would  like  to  know,  for  example,  what  it  is  that  a  person  does,  step 
by  step,  in  constructing  Alligator’s  response  on  the  basis  of  The 
Old  Woman’s  sentence.  Does  he  first  recognize  the  subject,  then 
analyze  the  verbal  construction,  then  delete,  then  invert,  then  pro- 
nominalize,  then  negate?  Or  does  he  do  it  in  some  different  order? 
Or  does  he  first  interpret  the  proposition  underlying  The  Old 
Woman’s  sentence  and  then  use  that  abstract  conception  as  the 
input  for  generating  a  tag  question?  Or  is  it  not  a  serial  process  at 
all?  These  questions  are  not  easily  answered.  The  cognitive  processes 
whereby  a  tag  is  added  remain  stubbornly  unclarified,  even  though 
their  formal  consequences  are  well-known  and  easily  characterized. 

One  way  to  make  such  questions  definite  and,  thus,  potentially, 
answerable,  is  to  phrase  them  in  terms  of  a  computer  model.  As  a 
psychologist,  of  course,  I  have  no  vested  interest  in  making  com¬ 
puters  process  natural  languages,  but  I  do  have  a  real  interest  in 
understanding  well  enough  how  people  do  it  that  I  could  express 
that  understanding  in  the  rigorous  form  needed  to  support  computer 
simulation.  There  are  many  people  who  would  like  computers 
to  treat  language  in  a  humanoid  manner;  there  seem  to  be  many 
potential  benefits  to  be  gained  from  having  such  computer  programs. 
My  interest  in  such  simulation,  however,  is  merely  to  test  my  own 
understanding  of  how  human  beings  perform  these  intricate  feats. 

Suppose,  therefore,  that  we  wanted  a  computer  to  play  Alligator’s 
role  in  conversation  with  The  Old  Woman.  What  program  of  in¬ 
structions  would  the  computer  need?  First,  we  would  have  to  tell 
the  computer  how  to  recognize  subjects  and  predicates  in  English 
sentences,  which  is  no  easy  matter  to  explain  even  to  students  who 
speak  English  as  their  native  tongue.  When  you  must  explain  it  to 
a  machine  that  cannot  tell  an  English  sentence  from  a  table  of 
random  numbers  you  must  be  very  precise  indeed. 

How  might  we  tell  a  computer  to  recognize  the  subject  of  an 
English  sentence?  We  might,  for  example,  give  the  computer  a  list 
of  nouns,  since  nouns  are  often  the  subjects  of  our  sentences.  If 
“bus”  were  on  the  list  of  nouns,  for  example,  we  could  tell  the 
computer  that  when  it  was  the  first  noun  in  the  sentence,  to  treat 
"bus"  as  the  subject  and  to  substitute  "it"  for  "bus"  in  pronominal- 
izing.  This  would  work  for  such  sentences  as  “The  bus  has  left,"  and 
would  give  us  “The  it  has  left."  But  wait  a  moment.  What  is  "The" 
still  doing  in  the  pronominalized  sentence?  Obviously,  "it”  must  be 
substituted  for  the  whole  noun  phrase,  not  just  for  the  noun  itself. 

For  example,  in  the  sentence,  “The  first  bus  has  left,”  we  want  to 
substitute  “it”  for  “The  first  bus,”  and  so  obtain  “It  has  left.”  This 
is  a  bit  better.  But  now  what  do  we  tell  a  computer  to  do  about 
“The  bus  and  the  taxi  have  left”?  Should  this  be  converted  into  “It 
and  the  taxi  have  left”?  Probably  not.  Somehow  the  computer  must 
be  made  to  understand  that  English  can  have  compound  phrases  as 
subjects  of  grammatical  sentences,  and  that  the  proper  pronominali- 
zation  in  such  cases  is  “they,”  not  “it.”  And,  of  course,  we  must  not 
overlook  the  fact  that  “bus”  can  serve  as  an  adjective  as  well  as  a 
noun:  “The  bus  driver  has  left”  should  not  lead  to  “Hasn’t  it?” 
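
The weakness of the noun-list rule is easy to reproduce. The sketch below implements only the first, naive rule (replace the first listed noun with "it" and touch nothing else); the noun list `NOUNS` and the function name `naive_pronominalize` are assumptions made for the illustration.

```python
NOUNS = {"bus", "taxi"}  # hypothetical noun list given to the machine


def naive_pronominalize(sentence):
    """Replace the first listed noun with 'it', leaving everything else alone."""
    words = sentence.rstrip(".").split()
    for i, word in enumerate(words):
        if word.lower() in NOUNS:
            words[i] = "it"
            break
    return " ".join(words) + "."


for s in [
    "The bus has left.",
    "The first bus has left.",
    "The bus and the taxi have left.",
    "The bus driver has left.",
]:
    print(naive_pronominalize(s))
```

Every sentence it produces is ill-formed, each for one of the reasons given above: the rule knows nothing of the rest of the noun phrase, of compound subjects, or of nouns used as modifiers.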

After  an  analysis  of  the  kinds  of  nominal  constructions  that  can 
serve  as  subjects  of  English  sentences,  it  becomes  clear  that  syntactic 
analysis  routines  would  have  to  be  written  for  the  computer  that 
would  enable  it  to  deal  with  nearly  all  aspects  of  English  grammar. 

It  seems  to  be  impossible  to  tell  solely  from  the  formal  attributes  of 
any  word  in  a  sentence,  as  it  is  spoken  or  written,  what  grammatical 
function  that  word  may  be  playing  in  the  sentence.  As  linguists  are 
fond  of  reminding  us,  it  is  necessary  to  penetrate  beyond  the  surface 
structure  of  the  sentence  to  the  deeper  relations  that  underlie  it.  I 
will  not  attempt  to  develop  this  argument  here,  but  will  merely  com¬ 
ment  that  this  penetration  beneath  the  surface  structure  requires 
considerable  knowledge  of  English  syntax  and  semantics;  enough 
knowledge,  in  fact,  that  we  do  not  yet  know  how  to  make  all  of  it 
sufficiently  clear  and  explicit  for  a  computing  machine.  Which  is  a 
large  part  of  the  reason  that  machines  have,  thus  far,  been  inade¬ 
quate  to  deal  with  human  language  the  same  way  human  beings  do. 
But  we  are  learning,  and  as  we  learn  to  write  better  programs,  the 
computers  will  improve  their  performance. 

SEMANTICS 

Now  what  can  be  said  about  semantics?  Anyone  who  aspires  to  the 
scientific  study  of  semantics  will  soon  discover  that  he  has  almost 
no  theoretical  basis  from  which  to  begin.  Whereas  phonological 
studies  of  the  sound  patterns  of  spoken  languages  have  been  well 
formulated  and  intensively  studied  for  many  years,  and  linguistic 
and  psychological  studies  of  syntax  are  currently  developing  an 
interesting  and  respectable  body  of  scientific  knowledge,  the  major 
impression  you  get  from  a  review  of  research  in  semantics  is  of  overwhelming 
diversity  and  heterogeneity.  The  problem  is  not  just  that 
we  have  no  theory  to  build  on;  there  seems  to  be  almost  no  con¬ 
sensus  concerning  the  range  of  phenomena  for  which  a  theory 
should  be  constructed.  What  we  colloquially  call  “meaning”  is  no 
simple,  homogeneous  thing.  Several  different  problems  are  con¬ 
founded  together  under  this  heading.  In  this  situation,  therefore, 
it  is  advisable  to  begin  by  narrowing  down  the  subject  a  bit  with 
some  terminological  distinctions. 

Philosophers  generally  distinguish  two  aspects  of  semantics: 
reference  and  meaning.  Reference  is  concerned  with  relations  be¬ 
tween  linguistic  symbols  and  other,  usually  nonlinguistic,  entities, 
states,  or  processes;  a  theory  of  reference  must  deal  with  such  mat¬ 
ters  as  naming,  truth,  extension.  Meaning,  in  this  context,  is  con¬ 
cerned  with  relations  among  linguistic  symbols;  a  theory  of  meaning 
must  deal  with  such  matters  as  significance,  synonymity,  analyticity, 
intension.  A  reader  interested  in  pursuing  these  topics  in  a  phil¬ 
osophical  vein  might  be  well  advised  to  take  J.  J.  Katz’s  The  Philos¬ 
ophy  of  Language  as  a  starting  point.  There  is  some  disagreement 
among  philosophers  as  to  whether  meaning  could  be  reduced  to 
reference  if  the  theories  were  properly  formulated,  but  I  will  not 
try  to  judge  the  merits  of  this  argument;  here  I  will  respect  the 
usual  referential-ideational  dichotomy  for  its  didactic  value. 

Those  not  familiar  with  this  distinction  may  find  that  a  simple 
example,  modeled  after  Frege,  will  suggest  what  is  involved.  Con¬ 
sider  the  sentence,  “Robert  McNamara  is  Secretary  of  Defense.”  As 
of  June  1967  the  referent  of  the  name  “Robert  McNamara”  and  the 
referent  of  the  name  “Secretary  of  Defense”  are  identical.  In  spite 
of  the  fact  that  both  names  refer  to  the  same  person,  however,  they 
do  not  have  the  same  meaning.  If  they  did  have  the  same  meaning 
as  well  as  the  same  reference,  then  the  sentences  “Robert  McNamara 
is  Secretary  of  Defense”  and  “Robert  McNamara  is  Robert  McNamara” 
would  have  exactly  the  same  significance.  Since  they  do 
not  have  the  same  significance,  it  is  necessary  to  distinguish  between 
reference  and  meaning.  This  distinction  enables  us  to  conclude  that 
two  words  that  refer  to  the  same  thing  need  not  have  the  same 
meaning. 

Now,  reference  is  obviously  important  for  human  language,  but 
it  is  not  a  unique  feature  of  human  language.  Many  nonlinguistic 
stimuli  can  function  as  signals  by  virtue  of  associations  that  give 
them  referential  character;  I  have  no  doubt  that  such  referential 
associations  can  be  learned  by  many  animals  other  than  Homo 
sapiens.  Such  associations  are,  of  course,  essential  for  language. 
However,  according  to  the  position  taken  here,  it  is  not  reference, 
but  predication  that  sets  human  language  apart  from  the  signal 
systems  used  by  other  animals.  When  we  try  to  establish  a  semantic 
basis  for  predication  we  are  led  into  problems  that  belong  directly 
to  a  theory  of  meaning,  and  only  indirectly  to  a  theory  of  reference. 

Predication  is  the  affirmation  of  a  comment  about  a  topic.  If  we 
take  as  a  central  task  of  semantics  to  explain  how  such  predicated 
relations  are  interpreted,  then  we  are  confronted  with  a  problem  in 
the  theory  of  meaning.  Presumably,  interpretation  of  a  subject- 
predicate  relation  must  be  characterized  somehow  in  terms  of  the 
interpretations  of  its  parts  and  the  manner  of  their  combination. 
That  is  to  say,  we  would  like  to  characterize  the  meanings  of  its 
constituents  in  such  a  manner  that,  when  they  are  combined,  the 
meaning  of  the  sentence  will  be  projected  automatically  by  a  par¬ 
ticular  rule  for  combining  a  subject  meaning  with  a  predicate 
meaning.  Perhaps  it  would  not  be  too  misleading  to  think  of  this 
characterization  as  if  it  were  a  problem  in  mental  chemistry;  the 
elements  must  be  characterized  in  such  a  way  as  to  enable  us  to 
predict  which  of  them  can  be  combined,  and  what  the  result  of 
their  combinations  will  be.  In  order  to  accomplish  this,  there  must 
be  a  structure  underlying  the  lexicon,  just  as  there  is  an  underlying 
structure  behind  the  chemical  table  of  elements. 

The  difficulty  with  this  chemical  metaphor,  from  a  psychological 
point  of  view,  is  that  a  limitless  variety  of  presuppositions  (facts 
familiar  to  and  taken  for  granted  by  both  talker  and  listener,  yet 
not  actually  expressed  in  the  sentence)  can  play  a  role  in  understanding 
the  sentence.  Only  in  special  cases  can  the  meaning  of  a 
sentential  compound  be  specified  completely  in  terms  of  its  semantic 
elements  and  their  syntactic  interrelations;  usually  information  is 
invoked  that  has  no  place  in  either  the  lexicon  or  grammar  of 
English.  Speech  can  be  context  free,  but  usually  it  is  not;  it  is 
almost  never  free  of  shared  presuppositions.  A  better  formulation 
for  a  psychologist  is  that  a  sentence  does  not  contain  its  speaker’s 
meaning  as  a  sponge  contains  water;  rather  it  provides  some  in¬ 
formation  that  a  listener  can  use  in  constructing  a  meaning  of  his 
own.  If  this  conception  of  listening  (as  a  creative  process)  is  accepted, 
then  a  complete  analysis  of  sentence  meanings  into  word  meanings 
becomes  very  difficult,  if  not  impossible. 

Whether  or  not  a  complete  psychological  account  of  sentence  in¬ 
terpretation  is  possible,  however,  there  is  a  more  modest  goal  that 
we  might  hope  to  attain.  Sentence  interpretation  is  not  capriciously 
related  to  lexical  meaning  and  syntactic  function;  contexts  and  pre¬ 
suppositions  can  be  important,  but  they  are  not  always  and  every¬ 
where  the  only  determining  factors  for  interpretation.  Lexicon  and 
grammar  obviously  contribute  something;  their  contribution  can 
be  isolated  from  the  total  psychological  process  and  studied  in  its 
own  right.  Katz  and  Fodor  have  suggested  that  we  imagine  finding 
an  unmarked  envelope  containing  a  sheet  on  which  a  single  sen¬ 
tence  is  written,  devoid  of  any  indication  of  source,  destination,  con¬ 
text,  or  presupposition.  Such  a  context-free  sentence,  if  it  is  gram¬ 
matical  and  composed  of  familiar  words,  will  not  be  completely 
unintelligible  to  a  person  who  knows  the  language.  Of  course,  endless 
subtleties  of  interpretation  may  be  added  when  the  sentence  is 
put  into  a  context  of  actual  use,  but  that  does  not  alter  the  fact  that 
some  interpretation  can  be  made  even  in  the  absence  of  contextual 
knowledge.  It  is  this  reduced,  but  not  unimportant  process  of  context-free 
interpretation  that  we  have  in  mind  when  we  resort  to  a 
chemical  metaphor. 

Within  these  limitations,  then,  a  central  problem  for  a  theory  of 
meaning  is  to  explain  how  the  meaning  of  a  grammatical  compound 
can  be  derived  from,  or  characterized  in  terms  of,  the  meanings  of 
its  constituent  elements. 

One  approach  to  this  task  is  to  ask  how  the  meanings  of  the 
elements  might  be  characterized.  The  most  familiar  answer,  of 
course,  is  given  by  lexicographers  in  the  form  of  a  dictionary,  where 
the  meanings  of  words  are  characterized  in  terms  of  explanatory 
phrases  and/or  mutually  substitutable  expressions.  Some  such  an¬ 
swer  is  a  necessary  part  of  any  semantic  theory  of  a  language,  al¬ 
though  in  our  theories  we  would  probably  like  to  make  the  relations 
between  entries  more  obvious,  and  we  would  certainly  like  to  be 
more  explicit  about  the  rules  for  interpreting  combinations  of 
elements.  Our  theory,  in  short,  should  systematically  display  and 
exploit  the  cognitive  structure  underlying  the  lexicon. 

What  guarantee  do  we  have  that  the  lexicon  has  structure,  or  that 
any  simplification  or  regularization  of  it  will  be  possible?  The  fact 
that  little  children  must  learn  the  lexicon  (and  adults  remember 
it)  argues  that  it  must  be  simpler  than  it  looks;  no  one  learns  the 
meanings  of  words  by  memorizing  dictionary  entries  as  if  they  were 
independent  and  unrelated  items  in  a  paired-associates  learning 
experiment.  The  need  for  cognitive  economy  in  remembering  and 
thinking  argues  that  there  must  be  some  simpler  system  of  concepts 
and  relations  underlying  the  apparent  heterogeneity  of  the  dic¬ 
tionary. 

Remember  where  we  have  come.  We  asked  about  the  communica¬ 
tion  of  meaning.  Our  first  step  was  to  rephrase  this  question  in 
terms  of  the  interpretation  of  predications,  and  our  second  step  was 
to  ask  for  an  analytic  answer,  i.e.,  an  account  of  sentence  interpreta¬ 
tion  expressed  in  terms  of  the  meanings  assigned  to  the  component 
elements  of  a  predication.  The  meanings  that  would  be  assigned 
to  component  elements  by  a  dictionary  are  necessary  but  not  suffi¬ 
cient  for  psychological  purposes;  they  make  no  pretense  of  represent¬ 
ing  the  cognitive  system  into  which  these  elements  fit.  Behind  the 
formal  lexicon  compiled  for  purposes  of  linguistic  description  there 
must  be  a  psychological  lexicon  "in  the  head”  of  the  language  user; 
this  subjective  lexical  competence  bears  little  if  any  resemblance  to 
an  alphabetic  listing  of  words  along  with  their  associated  pronun¬ 
ciations  and  definitions.  But  what  does  it  resemble?  If  we  tried  to 
imagine  what  an  entry  in  the  psychological  lexicon  might  be  like, 
we  would  probably  propose  some  kind  of  triadic  constellation  that 
included  conceptual,  imaginal  (perceptual  or  memory  images),  and 
symbolic  aspects.  Our  concept  of  a  frog,  our  imagery  of  frogs,  and 
the  symbol  “frog"  are  somehow  integrated  into  a  psycholexical 
entity.  The  psychology  of  reference  concerns  the  relation  between 
imagery  and  symbol;  the  psychology  of  meaning  concerns  the  re¬ 
lation  between  concept  and  symbol.  And— most  important— each 
of  these  complex  concept-image-symbol  entities  is  related  to,  or 
associated  with,  many  other  similar  complex  entities  in  some  sys¬ 
temic  way  that  we  would  like  to  be  able  to  describe.  If  the  imaginal 
aspects  can  be  set  aside,  at  least  temporarily,  we  might  hope  for 
some  description  of  the  conceptual  relations  among  symbols. 

We  would  hope  to  make  explicit  the  psychological  structure  of 
the  lexicon  in  such  a  way  that  (context-free)  meanings  of  grammatical 
compounds  could  be  inferred  directly  from  semantic  specifications 
given  for  constituent  elements  and  from  the  manner  of 
their  combination.  We  are  still  very  far  from  having  such  a  theory 
for  any  natural  language;  most  of  the  discussion  so  far  has  been 
concerned  with  the  general  form  that  such  a  theory  might  take  if  we 
did  have  it.  Various  ways  to  accomplish  the  general  aim  have  been 
proposed  and  explored  in  a  preliminary  fashion. 

Theoretical  alternatives.  Consider  a  spatial  representation  of 
meanings,  not  because  it  is  correct,  but  because  it  offers  a  frame  of 
reference  within  which  the  problem  of  theory  construction  can  be 
discussed.  In  a  spatial  representation,  for  example,  we  might  imagine 
that  any  particular  meaning  is  a  point  in  some  cognitive  hyperspace, 
its  position  being  determined  by  its  values  on  the  set  of  orthogonal 
axes  defining  the  space.  Then  the  meaning  of  a  sentence  that  con¬ 
tains  several  words  might  also  be  a  point  in  the  hyperspace  com¬ 
puted  from  the  positions  of  its  elements  (e.g.,  their  center  of  gravity). 
Or  the  sentence  might  be  represented  by  some  more  complicated 
entity  (e.g.,  a  vector,  or  a  directed  graph  through  the  component 
points,  etc.).  Or,  if  a  metric  space  seemed  inappropriate,  we  might 
consider  some  more  discrete  kind  of  “space”  having,  say,  only  a 
finite  number  of  values  (usually  only  two)  along  each  axis;  there 
might  be  various  abstract  ways  to  compound  the  spaces  for  individual 
words  into  spaces  appropriate  for  phrases  and  whole  sentences. 
There  would  be  a  question  as  to  whether  the  axes  of  such 
a  model  should  themselves  be  words  and/or  phrases  in  the  language, 
or  whether  it  is  better  to  regard  the  axes  as  purely  abstract  concep¬ 
tions  invented  by  the  semanticist  for  the  convenience  of  his  own 
theory. 
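
The simplest version of the spatial idea, in which a sentence is the center of gravity of its words, can be written down directly. The sketch below is illustrative only: the three axes are unnamed, the coordinates in `WORD_POINTS` are invented, and `center_of_gravity` is a hypothetical helper, not a procedure anyone has proposed in this form.

```python
from statistics import fmean

# Hypothetical coordinates on three made-up axes. The numbers carry no
# empirical weight; they exist only to show the computation.
WORD_POINTS = {
    "man": (1.0, 1.0, 0.0),
    "bites": (0.0, 0.5, 1.0),
    "dog": (1.0, 0.0, 0.0),
}


def center_of_gravity(words):
    """Return the centroid of the points assigned to the given words."""
    points = [WORD_POINTS[w] for w in words]
    # zip(*points) regroups the coordinates axis by axis.
    return tuple(fmean(axis) for axis in zip(*points))


a = center_of_gravity(["man", "bites", "dog"])
b = center_of_gravity(["dog", "bites", "man"])
print(a, b, a == b)
```

One consequence is worth noticing: a centroid is indifferent to word order, so the two sentences above land on the same point. That may be one reason to prefer one of the more complicated entities mentioned, such as a directed graph through the component points.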

Since  there  is  no  general  agreement  about  the  correct  strategy  to 
follow  here,  it  is  difficult  to  discuss  the  problem  intelligibly  at  such 
an  abstract  level;  it  is  difficult  to  say  something  substantive  without 
saying  more  than  is  justified.  Perhaps  the  best  way  to  give  a  clearer 
impression  of  the  conceptual  possibilities  and  difficulties  is  to 
mention  some  examples. 

Social  anthropologists  who  have  been  concerned  with  this  semantic 
problem  have  developed  something  called  “componential  analysis.”* 
A  semantic  component  is  a  feature  or,  to  stay  with  the  spatial  model, 
an  axis  or  dimension,  usually  having  a  discrete  number  of  values. 
For  example,  sex  might  be  a  semantic  component;  it 
would  have  two  values,  male  and  female;  such  words  as  “man,” 
“bull,”  “son,”  etc.,  would  all  have  one  value,  and  such  words  as 
“woman,”  “cow,”  “daughter,”  etc.,  would  all  have  the  other  value. 
The  general  aim,  of  course,  is  to  select  several  such  semantic  components 
in  such  a  way  that  each  entry  in  the  lexicon  would  have  its 
own  unique  vector  of  values  on  the  several  dimensions,  and  entries 
that  seemed  similar  in  meaning  would  share  more  values  in  common 
than  would  entries  whose  meanings  seemed  unrelated. 

*  Different  theorists  have  slightly  different  interpretations  of  componential 
analysis;  what  is  said  here  will  be  right  in  general  conception  but  probably  wrong 
in  specific  detail. 

When  they  began  to  work  with  sets  of  semantic  components,  social 
anthropologists  found  it  necessary  to  distinguish  two  different  possi¬ 
bilities,  the  paradigmatic  and  the  taxonomic.  In  a  paradigmatic 
system,  insofar  as  possible,  every  term  is  given  a  value  on  every 
component.  If  all  the  components  were  binary,  this  would  mean 
that  n  components  could  characterize  2^n  different  items,  which 
would  be  a  very  efficient  way  to  code  meanings. 
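
A paradigmatic system with binary components is simply a table of bit vectors, and a short sketch may make the economy of the coding concrete. The three components named in `COMPONENTS` and the six entries in `LEXICON` are assumptions chosen for illustration (only the sex component and some of the words come from the discussion above); they are not the components of Fig. 4.

```python
from itertools import product

COMPONENTS = ("male", "human", "adult")  # hypothetical binary components

# Each entry gets a value (1 or 0) on every component.
LEXICON = {
    "man": (1, 1, 1),
    "woman": (0, 1, 1),
    "boy": (1, 1, 0),
    "girl": (0, 1, 0),
    "bull": (1, 0, 1),
    "cow": (0, 0, 1),
}


def capacity(n):
    """Number of distinct items that n binary components can distinguish."""
    return len(list(product((0, 1), repeat=n)))


def shared_values(a, b):
    """Count the components on which two entries have the same value."""
    return sum(x == y for x, y in zip(LEXICON[a], LEXICON[b]))


print(capacity(len(COMPONENTS)))
print(shared_values("man", "boy"), shared_values("man", "cow"))
```

Each entry has its own vector, and entries that seem close in meaning ("man" and "boy") share more values than entries that seem remote ("man" and "cow"), which is the general aim stated above.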

Consider  a  paradigmatic  example.  In  Fig.  4  a  table  is  given  showing 
how  three  semantic  components  might  be  used  to  define  eleven 
different  terms  in  the  English  system  of  kinship  terminology.  In 
Fig.  5  this  same  classification  is  given  a  spatial  representation. 
Anthropologists  for  various  reasons  are  much  interested  in  kinship  (it 
is  related  to  marriage  practices,  family  and  tribal  structure,  economic 
relations  and  religious  beliefs,  etc.),  so  this  kind  of  paradigmatic 
specification  is  as  important  as  it  is  economical. 

Figure  4.  Matrix  representation  of  a  paradigmatic  semantic  system  for 
English  kinship  terms. 

Unfortunately,  however,  paradigmatic  systems  seem  to  be  the 
exception  rather  than  the  rule.  In  most  cases  a  taxonomic  structure 
is  all  that  can  be  established.  In  a  taxonomic  system,  not  every  term 
can  be  given  a  value  on  every  component;  e.g.,  if  sex  is  to  be  a 
semantic  component,  would  “tree”  be  male  or  female?  Still,  there 
may  be  a  hierarchy  among  the  dimensions  of  a  sort  that  characteristically 
leads  to  taxonomic  trees.  In  order  to  distinguish  paradigmatic 
dimensions  from  taxonomic  dimensions,  let  me  call  the 
former  “semantic  components”  and  the  latter  “semantic  markers.” 

Figure  5.  Spatial  representation  of  the  paradigmatic  system  given  in  Fig.  4. 

How  a  taxonomic  tree  can  result  from  a  system  of  semantic 
markers  is  illustrated  in  Fig.  6.  Here  we  again  have  a  matrix  with 
items  across  the  top,  dimensions  down  the  margin,  and  cell  entries 
indicating  the  value  of  the  particular  item  on  the  particular  dimension. 
In  this  case,  however,  many  of  the  cell  entries  are  blank,  which 
should  be  interpreted  to  mean  that  dimension  is  simply  not  relevant 
to  that  item.  For  example,  it  is  simply  not  relevant  to  ask  whether 
“fear”  should  be  marked  as  living  or  nonliving,  so  that  cell  is  left 
blank.  The  result  of  this  interaction  between  dimensions  and  items 
is  that  the  dimensions  are  not  used  with  maximum  efficiency;  in 
terms  of  information  theory,  semantic  markers  provide  a  redundant 
coding  system.  The  nature  of  this  redundancy  is  spelled  out  at  the 
bottom  of  Fig.  6,  where,  for  instance,  it  is  noted  that  every  item  that 
is  marked  for  L  (living  vs.  nonliving)  is  also  marked  +  on  O  (is  an 
object);  that  is  to  say,  if  any  item  is  marked  for  L,  we  know  automatically 
how  it  will  be  marked  for  O.  Similarly,  anything  marked 
for  V  (plant  vs.  animal)  will  be  marked  +  on  L  (will  be  living),  etc. 

Figure  6.  Matrix  representation  of  a  taxonomic  semantic  system  for  English 
nouns.  The  redundancy  rules  should  be  interpreted  as  follows:  “Any 
word  that  is  marked  either  as  living  or  nonliving  is  marked  +  for  object,” 
etc. 

When  we  examine  these  redundancy  rules,  we  find  that  they  can 
be  summarized  in  a  tree  graph  as  shown  in  Fig.  7.  For  example,  the 
word  “tiger”  is  marked  +  on  F  (feral  vs.  domesticated);  the  tree 
graph  tells  us  that  “tiger”  is  also  subhuman,  animal,  living,  and 
object.  The  redundancy  rules,  therefore,  can  be  interpreted  as 
representing  what  a  language  user  knows  about  the  structure  of 
the  lexicon,  a  kind  of  basic  semantic  framework  into  which  new 
terms  can  be  assimilated  as  they  are  learned. 
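
Redundancy rules of this kind amount to a table in which each marker value points to the value it implies one level up, so that everything implied by a single marking can be read off by following the chain. The sketch below is a minimal illustration: the chain from "feral" follows the tiger example in the text, while the table `IMPLIES`, its layout, and the function `implied_markers` are assumptions of the sketch.

```python
# Each marker value points to the value it implies one level up the tree.
# The "feral" chain follows the tiger example; the table itself is a
# hypothetical encoding, not a transcription of Fig. 6 or Fig. 7.
IMPLIES = {
    "feral": "subhuman",
    "subhuman": "animal",
    "animal": "living",
    "plant": "living",
    "living": "object",
}


def implied_markers(marker):
    """Follow the redundancy rules upward from one marker value."""
    chain = []
    while marker in IMPLIES:
        marker = IMPLIES[marker]
        chain.append(marker)
    return chain


print(implied_markers("feral"))
print(implied_markers("plant"))
```

A new word need only be attached at one node; everything above that node comes with it, which is one way to picture how new terms are assimilated into an existing framework.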

Semantic  components  and  semantic  markers,  as  these  terms  are 
generally  used,  are  abstract  dimensions:  they  may  have  simple  and 
appropriate  names,  but  if  so,  that  fact  is  irrelevant  and  unnecessary. 
Components  and  markers  provide  a  conceptual  framework  within 
which  lexical  items  can  be  located,  much  as  points  are  located  in  a 
space.  There  is,  however,  a  slightly  different  way  to  think  of  these 
relations,  a  way  that  may  be  more  agreeable  to  psychologists,  since 
it  seems  closer  to  their  traditional  concept  of  association.  We  can 
think  of  them  in  terms  of  associations  between  words.  This  approach 
is  perhaps  one  step  closer  to  the  usual  lexicographic  practice  of 
defining  one  word  in  terms  of  other  words. 

Figure  7.  The  redundancy  rules  of  Fig.  6  are  here  redrawn  to  display  more 
clearly  their  hierarchic  structure. 

Suppose  we  were  to  think  of  semantic  markers  as  themselves 
being  items  in  the  lexicon,  and  instead  of  imagining  them  to  be 
dimensions  in  a  space,  suppose  we  were  to  assume  a  special  kind 
of  association  that  has  been  learned  between  them  and  the  words 
they  mark.  One  very  important  instance  of  this  special  association 
would  be  the  asymmetric  inclusion  relation.  Under  this  interpretation 
we  can  look  at  the  taxonomic  tree  in  a  slightly  different  way, 
as  indicated  in  Fig.  8,  where  all  the  entries  are  related  by  the  same 
kind  of  inclusion  association.  Here,  for  example,  there  is  an  association 
between  “tree”  and  “plant”  of  the  kind  we  call  inclusion,  and 
another  inclusion  association  has  been  learned  between  “plant”  and 
“living,”  etc. 
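
An inclusion hierarchy of this kind can be written as a nested list, in the manner of the list-processing languages: each entry is a word followed by the sublists for the words it includes. The fragment `INCLUSION` below contains only the associations named in the text plus "animal"; the helper functions `find` and `includes` are hypothetical and serve only to show how the asymmetry of the relation falls out of the structure.

```python
# A fragment of an inclusion hierarchy as a nested list:
# each entry is [word, sublist, sublist, ...].
INCLUSION = ["living",
             ["plant", ["tree"]],
             ["animal"]]


def find(node, word):
    """Return the sublist headed by `word`, or None if it is absent."""
    if node[0] == word:
        return node
    for child in node[1:]:
        found = find(child, word)
        if found is not None:
            return found
    return None


def includes(whole, part):
    """True if `part` lies strictly below `whole` in the list structure."""
    top = find(INCLUSION, whole)
    if top is None:
        return False
    return any(find(child, part) is not None for child in top[1:])


print(includes("plant", "tree"))   # a tree is a plant
print(includes("tree", "plant"))   # a plant is not a tree
```

The relation holds in one direction only, and it is inherited through intermediate entries: "tree" is found under "living" by way of "plant".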

Fig.  8  also  represents  an  associative  hierarchy  in  the  form  of  a  list 
structure.  As  everyone  familiar  with  contemporary  computer  programming 
knows,  the  matrix  and  the  list  are  the  two  basic  modes  for 
organizing  the  memory  of  a  computer  system.  Either  matrix  or  list 
structures  can  be  used  to  represent  either  a  semantic  marker  or  a 
semantic  association  model  of  the  lexicon.  However,  I  believe  that 
anyone  familiar  with  list-processing  languages  would  be  strongly 
attracted  to  the  list  as  the  appropriate  form  of  organization  for  this 
kind  of  lexical  information.  List  vs.  matrix,  however,  is  a  tactical 
question;  since  we  still  have  important  strategic  questions  unsettled, 
I  do  not  wish  to  argue  the  point. 

FRAGMENT  OF  AN  INCLUSION  LIST  STRUCTURE 

Figure  8.  The  hierarchic  taxonomy  defined  on  nouns  by  an  inclusion 
relation  is  represented  both  as  a  list  and  as  a  tree  graph.  The  nodes  are 
labelled  with  lexical  entries,  rather  than  abstract  concepts,  as  in  Fig.  7. 

An  associative  model,  coded  as  a  list  structure,  however,  could 
also  be  used  to  organize  the  same  lexical  items  in  terms  of  more  than 
one  type  of  association.  In  addition  to  the  inclusion  illustrated  in 
Fig.  8,  we  might  also  want  to  have  a  part-whole  relation  of  the  sort 
illustrated  in  Fig.  9.  This  tree  graph  (or  list  structure)  represents  the 
fact  that  a  “chin”  is  a  part  of  a  “face,”  which  is  a  part  of  a  “head,” 
which  is  part  of  a  “body,”  which  is  part  of  a  “person,”  relations  that 
must  be  known  to  anyone  who  knows  what  these  words  mean.  A 
part-whole  association  gives  us  a  kind  of  hierarchic  inventory  of 
parts.  But  note  that  the  strict  concept  of  hierarchy  may  be  sacrificed, 
as  when  “neck”  is  judged  to  be  part  of  both  “head”  and  “torso”;  as 
long  as  no  loops  are  permitted,  the  structure  is  still  weakly  hierarchic. 
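
The condition that no loops are permitted is one that can be checked mechanically. In the sketch below the part-whole associations are stored as a table from each part to the wholes it belongs to, so that "neck" may have two entries. The table `PART_OF` follows the chain given in the text, except that placing "torso" under "body" is an assumption; `has_loop` and `wholes_of` are hypothetical helpers.

```python
# Part -> the wholes it belongs to. "neck" has two wholes, as in the text;
# putting "torso" under "body" is an assumption made for this sketch.
PART_OF = {
    "chin": ["face"],
    "face": ["head"],
    "neck": ["head", "torso"],
    "head": ["body"],
    "torso": ["body"],
    "body": ["person"],
}


def has_loop(part_of):
    """Detect a cycle by depth-first search over the part-of associations."""
    visiting, done = set(), set()

    def visit(item):
        if item in done:
            return False
        if item in visiting:
            return True  # came back to an item still being explored
        visiting.add(item)
        looped = any(visit(whole) for whole in part_of.get(item, []))
        visiting.discard(item)
        done.add(item)
        return looped

    return any(visit(item) for item in part_of)


def wholes_of(item, part_of):
    """All wholes containing `item`, directly or indirectly (loop-free input only)."""
    found = set()
    for whole in part_of.get(item, []):
        found.add(whole)
        found |= wholes_of(whole, part_of)
    return found


print(has_loop(PART_OF))
print(sorted(wholes_of("neck", PART_OF)))
```

With "neck" attached in two places the structure is no longer a strict tree, but the check still reports no loop, which is what is meant here by weakly hierarchic.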

Inclusion  relations  and  part-whole  relations  are  closely  tied  to 
FRAGMENT OF AN INVENTORY LIST STRUCTURE

[Tree graph: PERSON at the top; HEAD, TORSO, ARM, LEG on the second level; FACE, EAR, HAIR, NECK, CHEST, BACK, HAND, WRIST on the third; CHIN, CHEEK, EYE, NOSE, BROW, PALM, THUMB, FINGERS on the fourth.]

Figure 9. A list structure can also be defined by a part-whole relation. Note
the presence of “neck” at two places on the list.

important forms of predication. For example, the inclusion of “tree”
in “plant” is expressed by the (analytic) proposition “A tree is a
plant.” If the relation is reversed, as in “A plant is a tree,” it is
incorrect. Similarly, the part-whole relation between “chin” and
“face” is expressed by the (analytic?) proposition that “A face has
a chin.” Again, if the relation is reversed, as in “A chin has a face,”
we feel that something odd has been said. This connection of inclusion
relations and part-whole relations with particular types of predication
is, of course, a familiar fact to those who follow philosophical
analyses of language. Its relevance here is that it suggests a way to
look at the structure of the lexicon in terms of predication: it is as
if the verbs (predicates) imposed a conceptual structure on the
nouns (subjects).

These approaches to semantic analysis should give some feeling
for the kind of theoretical struggles that are currently going on. The
survey is certainly not exhaustive, and undoubtedly not unbiased
by my personal interests and research. For example, I have not even
mentioned the most famous attempt to summarize the conceptual
structure of the lexicon, namely, Roget's Thesaurus of English Words
and Phrases Classified and Arranged so as to Facilitate the Expression
of Ideas and to Assist in Literary Composition. Roget has little of
theoretical value to offer us. But if Roget was a bit short on theory,
he made up for it in energy; he pushed his classification system
through the whole lexicon, an impressive and surprisingly useful
enterprise. If we want to improve on Roget, we need not only a
better theory, but also a more systematic method of collecting data.
Roget presumably relied on his own intuitions about semantic similarity.
That may or may not be the best method; it is certainly not
the only method.

Empirical methods. Methodology, the bread-and-butter of a scientist
working in any given field, is usually spinach to those outside. I
will  try  to  keep  my  methodological  remarks  as  brief  as  possible. 

We  now  have  some  notion  of  what  a  theory  of  the  interpretation 
of  sentences  might  look  like,  and  some  glimmerings  of  the  kind  of 
lexical  information  about  constituent  elements  that  would  be  re¬ 
quired.  It  is  quite  difficult,  however,  to  launch  directly  into  the 
compilation  of  a  lexicon  along  the  lines  suggested  by  this  theory;  so 
many  apparently  arbitrary  decisions  are  involved  that  it  becomes 
advisable  to  try  to  verify  them  somehow  as  we  go.  One  important 
kind  of  verification,  of  course,  is  given  by  a  theorist's  own  intuitions 
as  a  native  speaker  of  the  language;  such  intuition  is  probably  the 
ultimate  court  of  appeal  in  any  case.  But,  if  possible,  it  would  be 
highly  desirable  to  have  some  more  objective  method  for  tapping 
the  intuition  of  language  users,  especially  if  many  people  could  pool 
their  opinions  and  if  information  about  many  different  words  could 
be collected and analyzed rapidly. A number of efforts have been
made  to  devise  such  methods. 

As far as I am aware, however, no objective methods for the direct
appraisal of semantic contents have yet been devised, either by
linguists or psychologists. What has been done instead is to investigate
the semantic distances that are implied by the kind of spatial
and semispatial representations we have just reviewed. The hope
is that from a measurement of the distances between concepts we
can infer something about the coordinates of their universe. Distance
can be related to similarity, and judgments of similarity of
meaning are relatively easy to get and to analyze. Several methods
are available.

The empirical problem is this: it is not difficult to devise ways
to estimate semantic similarities among words. On the basis of
such judgments of similarity we would like to construct, or at least
test, theoretical descriptive schemes of the sort just reviewed. Any
large-scale empirical attack on the problem would involve three
steps:

(1) Devise an appropriate method to estimate semantic similarities
and use it to obtain data from many judges about a large sample of
lexical items. These data form a symmetric items-by-items matrix,
where each cell a_ij is a measure of the semantic similarity between
item i and item j.

(2)  Devise  an  appropriate  method  to  explore  the  structure  under¬ 
lying  the  data  matrix.  Among  the  analytic  tools  already  available, 
factor  analysis  has  been  most  frequently  used,  but  various  alterna¬ 
tives  are  available.  In  my  research  I  have  used  a  method  of  cluster 
analysis  that  seems  to  work  rather  well,  but  improvements  would  be 
possible if we had a solid theoretical basis for preferring one kind of
representation  over  all  others. 

(3)  Identify  the  factors  or  clusters  in  terms  of  the  concepts  in  a 
semantic  theory.  In  most  cases,  this  is  merely  a  matter  of  finding 
appropriate  names  for  the  factors  or  clusters.  Given  the  backward 
state  of  semantic  theory  at  present,  however,  this  step  is  almost  not 
worth  taking.  Eventually,  of  course,  we  would  hope  to  have  semantic 
descriptions,  in  terms  of  marker  or  list  structures,  say,  from  which 
we  could  not  only  construct  sentence  interpretations,  but  could  also 
predict similarity data for any set of lexical items. At present, however,
we have not reached that stage of sophistication.

Most  of  the  energy  that  psychologists  have  invested  in  this  prob¬ 
lem  so  far  has  gone  into  step  (1),  the  development  of  methods  to 
measure  semantic  similarity.  However,  since  there  is  no  generally 
accepted  method  of  analysis  or  established  theory  against  which  to 
validate such measures, it is not easy to see why one method of data
collection should be preferred over the others. But, in spite of the
sometimes vicious circularity of this situation, I think we are slowly
making  some  headway  toward  meaningful  methods. 

There  are  four  general  methods  that  psychologists  have  used  to 
investigate similarities among semantic atoms: (1) scaling, (2) association,
(3) substitution, and (4) classification. I myself have worked
primarily  with  classification,  but  I  should  mention  briefly  what 
alternative  procedures  are  available.  Where  possible,  examples  of 
the  methods  will  be  cited,  but  no  attempt  at  a  thorough  review  is 
contemplated. 

(1) The method of subjective scaling known as magnitude estimation,
as described by S. S. Stevens in numerous publications, suggests
itself as a simple and direct method to obtain a matrix of
semantic similarity scores. So far as I know, this method has not
been used in any systematic search for semantic features.

Scaling  methods  have  been  used  in  psychometric  research.  Mosier 
used ratings to scale evaluative adjectives along a favorable-unfavorable
continuum, and Cliff used them to argue that adverbs of
degree  act  as  multipliers  for  the  adjectives  they  modify.  However, 
these  studies  did  not  attempt  to  construct  a  general  matrix  of 
similarity  measures  for  a  large  sample  of  the  lexicon,  or  to  discover 
new  semantic  features. 

One example of the use of scaling is a study reported by Rubenstein
and Goodenough. They asked people to rate pairs of nouns for
their  “similarity  of  meaning.”  They  used  a  five-point  scale,  where 
zero  indicated  the  lowest  degree  of  synonymy  and  four  the  highest. 
Their  averaged  results  for  65  pairs  of  words  included  the  following: 


cord        smile        0.02
cushion     jewel        0.45
forest      graveyard    1.00
hill        woodland     1.48
magician    oracle       1.82
sage        wizard       2.46
asylum      madhouse     3.04
serf        slave        3.46
midday      noon         3.94

Although  Rubenstein  and  Goodenough  did  not  obtain  a  complete 
matrix of all comparisons among the 48 words they used, their
results indicate that meaningful estimates can be obtained by this
technique. 

A  difficulty  that  any  procedure  must  face  is  that  a  truly  enormous 
amount of data is required. If a lexicon is to contain, say, 10^6 word
senses, then the similarity matrix will have 10^12 cells to be filled. It
is  obvious  immediately  that  any  empirical  approach  must  settle  for 
judgments  on  strategic  groups  of  word  items  selected  from  the  total 
lexicon.  But  even  with  that  necessary  restriction,  the  problem  is 
difficult.  If,  for  example,  we  decide  to  work  with  100  items  in  some 
particular  investigation,  there  are  still  4950  pairs  that  have  to  be 
judged.  If  we  want  several  judges  to  do  the  task,  and  each  judge  to 
replicate his data several times, the magnitude of the data collection
process becomes truly imposing. It is doubtful that judges could
maintain  their  interest  in  the  task  for  the  necessary  period  of  time. 
If  a  multidimensional  scaling  procedure  is  used  in  which  judges 
decide  which  two  of  three  items  are  most  similar,  the  number  of 
judgments  required  is  even  greater:  100  items  give  161,700  triplets 
to  be  judged. 
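
The counts quoted above are simply the numbers of unordered pairs and unordered triplets that can be drawn from 100 items. A minimal check, assuming nothing beyond the standard library (the function name `judgments_needed` is hypothetical):

```python
from math import comb


def judgments_needed(n_items, group_size):
    """Number of distinct unordered groups of `group_size` items."""
    return comb(n_items, group_size)


pairs = judgments_needed(100, 2)     # every pair judged once
triplets = judgments_needed(100, 3)  # every triplet judged once
print(pairs, triplets)
```

The growth is quadratic for pairs and cubic for triplets, which is why exhaustive comparison becomes impractical well before the item set is large.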

For  this  reason,  scaling  procedures  do  not  seem  feasible  for  any 
large  survey  of  semantic  items;  preference  must  be  given  to  methods 
that  confront  a  judge  with  the  items  one  at  a  time,  where  similarity 
is  estimated  on  the  basis  of  the  similarities  of  his  responses  to  the 
individual  items— thus  avoiding  the  data  explosion  that  occurs  when 
he must judge all possible pairs, or all possible triplets. Scaling
methods  should  probably  be  reserved  for  those  cases  where  we  want 
a  particularly  accurate  study  of  a  relatively  small  number  of  items. 

(2)  Because  of  the  historical  importance  of  association  in  philo¬ 
sophical  theories  of  psychology,  more  work  on  semantic  similarity 
has  been  done  with  associative  methods  than  with  any  other.  This 
work  has  been  reviewed  by  Creelman  and  also  by  Deese,  who  made 
it  the  starting  point  for  a  general  investigation  of  what  he  calls 
“associative  meaning.” 

In  the  most  familiar  form  of  the  associative  method,  people  are 
asked  to  say  (or  write)  the  first  word  they  think  of  when  they  hear 
(or  read)  a  particular  stimulus  word.  When  given  to  a  large  group  of 
people,  the  results  can  be  tabulated  in  the  form  of  a  frequency 
distribution,  starting  with  the  most  common  response  and  proceed¬ 
ing  down  to  those  idiosyncratic  responses  given  by  only  a  single 
person.  Then  the  similarity  of  two  stimulus  words  is  estimated  by 
observing  the  degree  to  which  their  response  distributions  coincide. 

The  procedure  for  estimating  the  degree  of  similarity  from  two 
response  distributions  is  a  very  general  one  that  has  been  used  in 
one  form  or  another  by  many  workers.  The  logic  behind  it  is  to 
express  the  measure  of  similarity  as  a  ratio  of  some  measure  of  the 
intersection  to  some  measure  of  the  union  of  the  two  distributions. 
In the case of word associations, the responses to one word constitute
one set, the responses to another word another. The intersection
of these two sets consists of all responses that are common to
the  two;  the  union  is  generally  interpreted  to  be  the  maximum 
number  of  common  responses  that  could  have  occurred.  The  re¬ 
sulting  ratio  is  thus  a  number  between  zero  and  one. 
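
One way such a ratio might be computed from two response tallies is sketched below. This is only one of the variants the definition allows: the intersection is taken as the summed per-response minimum of the two frequencies, and the "union" as the smaller of the two totals (an assumed reading of "the maximum number of common responses that could have occurred"). The function name and the tallies are invented for illustration.

```python
from collections import Counter


def overlap_ratio(responses_a, responses_b):
    """Intersection-union ratio for two response frequency distributions.

    Intersection: for each response, the smaller of its two frequencies,
    summed over all responses. "Union" (assumed here): the largest overlap
    that could have occurred, i.e. the smaller of the two totals.
    """
    a, b = Counter(responses_a), Counter(responses_b)
    shared = sum((a & b).values())  # Counter & keeps the per-key minimum
    possible = min(sum(a.values()), sum(b.values()))
    return shared / possible if possible else 0.0


# Hypothetical tallies from ten people per stimulus word (not real data).
to_couch = {"sofa": 5, "sit": 3, "bed": 2}
to_sofa = {"couch": 4, "sit": 4, "bed": 1, "soft": 1}
print(overlap_ratio(to_couch, to_sofa))
```

Identical distributions give a ratio of one and distributions with no response in common give zero, as the text requires.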

The argument is sufficiently general that intersection-union ratios
can  be  used  in  many  situations  other  than  word  association  tests; 
their  use  in  studies  of  information  retrieval,  where  synonymy  must 
be  exploited  to  retrieve  all  documents  relevant  to  a  given  request, 
has  been  reviewed  by  Giuliano  and  Jones  and  by  Kuhns.  The  ratio 
has  been  invented  independently  by  various  workers,  for  in  one 
version  or  another  it  is  the  natural  thing  to  do  when  faced  with 
data  of  this  type.  Consequently,  it  has  been  given  different  names  in 
different  contexts— intersection  coefficient,  coefficient  of  association, 
overlap  measure,  etc.— and  minor  details  of  definition  and  calcula¬ 
tion  have  occasionally  been  explored,  though  rather  inconclusively. 

The  utility  of  an  intersection-union  ratio  is  that  estimates  of  sim¬ 
ilarity  can  be  obtained  without  actually  presenting  all  possible 
pairs  to  the  judges.  The  assumption  underlying  it,  of  course,  is  that 
similarity  of  response  reflects  similarity  of  meaning.  If,  as  some 
psychologists  have  argued,  the  meaning  of  any  stimulus  is  all  the 
responses  it  evokes,  this  argument  is  plausible.  But  the  notion  that 
the  meaning  of  a  word  is  all  the  other  words  it  makes  you  think 
of  should  not  be  accepted  without  some  reservations. 

The  principal  recommendation  for  a  word-association  technique 
is  its  convenience  of  administration;  it  is  generally  given  in  written 
form  to  large  groups  of  subjects  simultaneously.  The  method  gives 
some  information  about  semantic  features,  since  an  associated  word 
frequently  shares  several  semantic  features  of  the  presented  word, 
but  it  is  also  sensitive  to  syntactic  and  phonological  association. 
Attempts  have  been  made  to  classify  associates  as  either  syntagmatic 
or  paradigmatic,  but  the  results  have  been  equivocal,  e.g.,  if  storm 
elicits  cloud,  or  flower  elicits  garden,  is  the  response  to  be  attributed 
to  paradigmatic  semantic  similarity  or  to  a  familiar  sequential  con¬ 
struction?  The  method  is  sensitive  only  to  high  degrees  of  similarity 
in  meaning;  most  pairs  of  the  words  elicit  no  shared  responses  at  all. 
And  no  account  is  taken  of  the  different  senses  that  a  word  can  have; 
when,  for  example,  fly  is  associated  with  bird  and  also  with  bug,  we 
suspect  that  fly  has  been  given  in  different  senses  by  different  people, 
but  the  data  provide  no  way  to  separate  them. 

A  variation  on  the  association  technique  that  combines  it  with 
the  scaling  methods  has  been  developed  and  extensively  used  by 
Osgood,  who  constrains  a  judge’s  response  to  one  or  the  other  of 
two antonymous adjectives. Several pairs of adjectives are used and
people are allowed to scale the strength of their response. By constraining
the judge’s responses to one of two alternatives Osgood
obtains  for  all  his  stimulus  words  distributions  of  responses  that 
are  sufficiently  similar  that  he  can  correlate  them,  even  for  words 
quite  unrelated  in  meaning.  When  these  correlations  are  subjected 
to  factor  analysis,  a  three-dimensional  space  is  generally  obtained. 
The  position  of  all  the  stimulus  words  can  be  plotted  in  a  three- 
dimensional  space  defined  by  the  antonymous  adjectives.  It  is  not 
clear  that  these  three  dimensions  bear  any  simple  relation  to  seman¬ 
tic  markers— nor  does  Osgood  claim  they  should.  It  is  true  that  words 
near  one  another  in  the  space  often  share  certain  semantic  features, 
but  the  method  gives  little  hint  as  to  what  those  shared  features 
might  be.  Osgood’s  method  is  most  useful  for  analyzing  attitudinal 
factors  associated  with  a  word. 

(3)  Another  approach  uses  substitution  as  the  test  for  semantic 
similarity.  In  linguistics  a  technique  has  been  developed  called  “dis¬ 
tributional  analysis.’’  The  distributional  method  has  been  most 
highly  developed  in  the  work  of  Zelig  Harris. 

Consider  all  the  words  that  can  be  substituted  in  a  given  context, 
and  all  the  contexts  in  which  a  given  word  can  be  substituted.  A 
linguist  defines  the  distribution  of  a  word  as  the  list  of  contexts  into 
which  the  word  can  be  substituted;  the  distributional  similarity  of 
two  words  is  thus  the  extent  to  which  they  can  be  substituted  into 
the  same  contexts.  One  could  equally  well  consider  the  distribu¬ 
tional  similarity  of  two  contexts.  Here  again  an  intersection-union 
ratio  of  the  two  sets  can  provide  a  useful  measure  of  similarity. 
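
As a rough illustration of distributional similarity, the sketch below treats the distribution of a word as a set of sentence frames and takes the ratio of shared frames to all frames admitting either word. The literal set union is used as the denominator, which is an assumption; the frames, the acceptability judgments, and the names `DISTRIBUTION` and `distributional_similarity` are invented for the example.

```python
def distributional_similarity(contexts_a, contexts_b):
    """Shared contexts divided by all contexts admitting either word."""
    a, b = set(contexts_a), set(contexts_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical acceptability judgments: the frames each word can fill.
DISTRIBUTION = {
    "couch": {"The ___ is upholstered.", "He slept on the ___.", "Move the ___."},
    "sofa": {"The ___ is upholstered.", "He slept on the ___.", "Move the ___."},
    "chair": {"The ___ is upholstered.", "Move the ___."},
    "sugar": {"Pass the ___."},
}

print(distributional_similarity(DISTRIBUTION["couch"], DISTRIBUTION["chair"]))
```

The same function applies unchanged to two contexts, each represented by the set of words it admits.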

Closely  related  to  distributional  similarity  is  a  measure  based  on 
co-occurrence.  Co-occurrence  means  that  the  words  appear  together 
in  some  corpus,  where  “appear  together’’  may  be  defined  in  various 
ways,  e.g.,  both  words  occur  in  the  same  sentence.  We  can,  if  we 
prefer, think of one word as providing the context for the other,
thus  making  the  distributional  aspect  explicit.  A  union-intersection 
measure  of  similarity  can  be  defined  by  taking  as  the  intersection 
the  number  of  times  the  words  actually  co-occurred,  and  as  the 
union  the  maximum  number  of  times  they  could  have  co-occurred, 
i.e.,  the  number  of  times  the  less  frequent  word  was  used  in  the 
corpus.  Co-occurrence  measures  have  the  advantage  that  they  can 
be  carried  out  automatically  by  a  properly  programmed  computer. 
Distributional  measures  can  in  general  be  made  automatic  if  a  very 
large corpus is available, large enough that the two words will recur
many  times. 
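
The co-occurrence measure just described lends itself to a very small program. The sketch below assumes that "appear together" means "occur in the same sentence" and counts a word's frequency as the number of sentences containing it, so that the denominator is the largest number of co-occurrences that was possible. The corpus and the function name are invented for illustration.

```python
def cooccurrence_similarity(sentences, word_a, word_b):
    """Co-occurrences divided by the frequency of the less frequent word.

    Frequency is counted in sentences containing the word (an assumption),
    so the denominator is the largest possible number of co-occurrences.
    """
    bags = [set(s.lower().split()) for s in sentences]
    n_a = sum(word_a in bag for bag in bags)
    n_b = sum(word_b in bag for bag in bags)
    both = sum(word_a in bag and word_b in bag for bag in bags)
    limit = min(n_a, n_b)
    return both / limit if limit else 0.0


# Toy corpus, invented for the example.
CORPUS = [
    "the storm brought a dark cloud",
    "a cloud hid the sun",
    "the storm passed by noon",
    "the garden was full of flowers",
]

print(cooccurrence_similarity(CORPUS, "storm", "cloud"))
```

With a corpus this small the figures mean nothing; the point is only that the tally needs no human judgment once the corpus is in machine-readable form.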

Several  psychologists  have  invented  or  adapted  variations  on  this 
distributional  theme  as  an  empirical  method  for  investigating  se¬ 
mantic  similarities.  It  is,  of  course,  the  basis  for  various  sentence 
completion  procedures  insofar  as  these  are  used  for  semantic  analysis. 
The  basic  assumption  on  which  the  method  rests  is  that  words  with 
similar  meanings  will  enjoy  the  same  privileges  of  occurrence,  i.e., 
will  be  substitutable  in  a  great  variety  of  contexts. 

For  example,  couch  and  sofa  can  be  substituted  interchangeably 
in  a  great  variety  of  contexts,  and  they  are  obviously  closely  related 
in  meaning.  In  terms  of  a  theory  of  semantic  markers,  some  such 
relation  would  be  expected,  since  the  semantic  features  of  the  words 
in  any  meaningful  sentence  are  interdependent.  The  predicate  is 
upholstered  imposes  certain  restrictions  on  the  semantic  markers 
of  its  subject,  and  only  words  that  have  those  required  features  can 
be  substituted  as  subject  without  violently  altering  the  acceptability 
of the sentence. Couch, sofa, chair, etc., are substitutable in the
context The ____ is upholstered, and are similar in meaning, whereas
sugar, hate, learn, delicate, rapidly, etc., are not. If the method
is used blindly, of course, it can lead to absurd results, e.g., no and
elementary are not similar in meaning just because they can both be
substituted into the frame John has studied ____ psychology.

If  judges  are  asked  to  say  whether  or  not  two  items  are  substitut¬ 
able  in  a  given  context,  they  must  be  instructed  as  to  what  is  to 
remain  invariant  under  the  substitution.  Various  criteria  can  be 
applied:  grammaticality,  truth,  plausibility,  etc.  The  results  can 
be  very  different  with  different  criteria.  If  meaning  is  to  be  pre¬ 
served,  for  example,  only  rather  close  synonyms  will  be  acceptable, 
whereas  if  grammaticality  is  to  be  preserved,  a  large  set  of  words  be¬ 
longing  to  the  same  syntactic  category  will  usually  be  acceptable. 

Stefflre  has  used  distributional  techniques  to  obtain  measures  of 
semantic  similarity.  He  takes  a  particular  word  and  asks  people  to 
generate  a  large  number  of  sentences  using  it.  Then  he  asks  them 
to  substitute  another  word  for  the  original  one  in  each  of  the 
sentences  they  have  written.  Taking  the  sentences  as  contexts  and 
the  whole  set  of  substitutions  as  his  set  of  words  to  be  scaled,  he 
creates  a  context-by-word  matrix,  and  has  his  subjects  judge  whether 
every  context-word  pair  in  the  matrix  is  a  plausible  sentence.  Then 
he can apply an intersection-union ratio either to the sets of contexts
shared  by  two  words,  or  to  the  sets  of  words  admissible  in  two 
contexts. 

In  England  Jones  has  proceeded  along  different  lines  toward  a 
similar  goal.  She  uses  the  Oxford  English  Dictionary  to  create  a 
list  of  synonyms,  or  near-synonyms,  for  every  word  sense,  then 
computes an intersection-union ratio for the number of shared
synonyms. 

(4) A number of workers have resorted to classification methods for
particular  purposes,  but  until  recently  there  appears  to  have  been 
no  systematic  use  of  these  methods  to  explore  semantic  features.  At 
present  several  variations  of  the  general  method  are  in  use,  but 
almost  nothing  of  this  work  had  appeared  in  print  at  the  time  this 
paper  was  written.  In  order  to  illustrate  the  classification  method, 
therefore,  I  will  describe  a  version  of  it  that  Herbert  Rubenstein, 
Virginia  Teller,  and  I  have  been  using  at  Harvard. 

In our method, the items to be classified are typed on file cards
and a judge is asked to sort them into piles on the basis of similarity
of meaning. He can form as many classes as he wants, and any
number of items can be placed in each class. His classification is
then recorded and summarized in a matrix, as indicated in Fig. 10,
where data for three judges classifying eight words are given for
illustrative purposes. A judge’s classification is tabulated in the
matrix as if he had considered every pair independently and judged
them to be either similar (tabulate 1) or dissimilar (tabulate 0). For
example, the first judge, S1, uses five classes to sort these eight words;
he puts “cow” and “tiger” together, “chair” and “rock” together,
and “fear” and “virtue” together, but leaves “tree” and “mother”
as isolates. In the data matrix, therefore, this judge’s data contribute
one tally in the cow-tiger cell, one in the chair-rock cell, and one in
the fear-virtue cell. The data for the second and third judge are
similarly scored, and the number of similarities indicated by their
classifications are similarly tabulated in the matrix. Thus, in this
example, all three subjects group the inanimate objects “chair” and
“rock” together, so “3” appears in that cell; two subjects group the
animate organisms “cow” and “mother” together; etc. After the
classifications of several judges are pooled in this manner, we obtain
a data matrix that can be interpreted as a matrix of measures of
semantic similarity; our assumption is that the more similar two

DATA

S1: (COW, TIGER) (CHAIR, ROCK) (FEAR, VIRTUE) (TREE) (MOTHER)

S2: (COW, MOTHER, TIGER, TREE) (CHAIR, ROCK) (FEAR, VIRTUE)

S3: (COW, MOTHER, TIGER) (CHAIR, ROCK, TREE) (FEAR, VIRTUE)

MATRIX

         CHAIR   COW     FEAR    MOTHER  ROCK    TIGER   TREE
COW      0
FEAR     0       0
MOTHER   0       2       0
ROCK     3       0       0       0
TIGER    0       3       0       2       0
TREE     1       1       0       1       1       1
VIRTUE   0       0       3       0       0       0       0

Figure 10. Illustration of how the classifications given by three judges
would be tabulated in matrix form for subsequent analysis.

items  are,  the  more  often  people  will  agree  in  classifying  them  to¬ 
gether.  In  our  experience,  judges  can  classify  as  many  as  100  items 
at  a  time,  and  as  few  as  20  judges  will  generally  suffice  to  give  at 
least  a  rough  indication  of  the  pattern  of  similarities. 
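
The tabulation can be sketched in a few lines. The sortings below are those of the three illustrative judges of Fig. 10, as described in the text; the function and variable names are hypothetical, and the layout printed by `print_matrix` is only one convenient way to show the lower triangle of the symmetric matrix.

```python
from collections import Counter
from itertools import combinations

# Sortings of the three judges in Fig. 10, one list of piles per judge.
SORTINGS = [
    [{"cow", "tiger"}, {"chair", "rock"}, {"fear", "virtue"}, {"tree"}, {"mother"}],
    [{"cow", "mother", "tiger", "tree"}, {"chair", "rock"}, {"fear", "virtue"}],
    [{"cow", "mother", "tiger"}, {"chair", "rock", "tree"}, {"fear", "virtue"}],
]


def tally(sortings):
    """Count, for every pair of items, how many judges put them in one pile."""
    counts = Counter()
    for piles in sortings:
        for pile in piles:
            for a, b in combinations(sorted(pile), 2):
                counts[(a, b)] += 1
    return counts


def print_matrix(counts, items):
    """Print the lower triangle of the symmetric items-by-items matrix."""
    for i, row in enumerate(items):
        cells = [str(counts[tuple(sorted((row, col)))]) for col in items[:i]]
        print(f"{row:<8}" + " ".join(cells))


COUNTS = tally(SORTINGS)
ITEMS = sorted(set().union(*SORTINGS[0]))
print_matrix(COUNTS, ITEMS)
```

Each judge handles every item only once, yet the pooled tallies fill every cell of the pairwise matrix, which is the economy the classification method is meant to provide.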

The  data  matrix  is  then  analyzed  by  a  procedure  described  and 
programmed  for  a  computer  by  S.  C.  Johnson  of  the  Bell  Telephone 
Laboratories.  The  general  principle  of  this  cluster  analysis  is  sug¬ 
gested  in  Fig.  11.  If  we  look  at  the  data  matrix  of  Fig.  10  for  those 
items  that  all  three  subjects  put  together,  then  we  have  the  five 
classes  shown  at  the  tips  of  the  tree  in  Fig.  11:  (cow,  tiger)  (mother) 

CLUSTER ANALYSIS

Figure 11. Illustration of a cluster analysis of the data tabulated in the
matrix of Fig. 10.

(tree)  (chair,  rock)  (fear,  virtue).  If  we  relax  our  definition  of  a  cluster 
to  mean  that  two  or  more  judges  agreed,  we  have  only  four  classes; 
“mother” joins with “cow” and “tiger” to form a single class. And
when we further relax our definition of a cluster to include the
judgment of only one person, we have only three classes: “tree”
joins “mother,” “cow,” and “tiger.” As Johnson points out, the level
of the node connecting two branches can be interpreted as a measure
of their similarity. The dotted lines at the top of Fig. 11 are
meant  to  suggest  that  an  object-nonobject  dichotomy  might  have 
emerged  with  more  data,  but  on  the  basis  of  the  data  collected,  that 
must  remain  conjectural. 
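
The successive relaxation of the criterion can be imitated with a short agglomerative routine. This is not Johnson's program, only a sketch under stated assumptions: two clusters are merged when every cross pair of their members was put together by at least `level` judges (a rule resembling what is now usually called complete linkage), and ties are broken in favor of the larger total tally. That tie-breaking rule is an assumption; it matters here because “tree” qualifies equally for the animate cluster and for the chair-rock cluster at the one-judge level.

```python
from itertools import combinations

# Pair tallies derived from the three sortings of Fig. 10
# (pairs not listed were never sorted together).
TALLIES = {
    ("chair", "rock"): 3, ("cow", "tiger"): 3, ("fear", "virtue"): 3,
    ("cow", "mother"): 2, ("mother", "tiger"): 2,
    ("chair", "tree"): 1, ("rock", "tree"): 1, ("cow", "tree"): 1,
    ("mother", "tree"): 1, ("tiger", "tree"): 1,
}
ITEMS = ["chair", "cow", "fear", "mother", "rock", "tiger", "tree", "virtue"]


def sim(a, b):
    """Number of judges who sorted items a and b into the same pile."""
    return TALLIES.get(tuple(sorted((a, b))), 0)


def link(c1, c2):
    """(weakest, total) similarity over all cross pairs of two clusters."""
    values = [sim(a, b) for a in c1 for b in c2]
    return min(values), sum(values)


def clusters_at(level):
    """Merge clusters while every cross pair has at least `level` tallies."""
    clusters = [frozenset([item]) for item in ITEMS]
    while len(clusters) > 1:
        # Best candidate: highest weakest link; ties go to the larger total
        # tally (an assumed tie-breaking rule, not taken from the source).
        best = max(combinations(clusters, 2), key=lambda pair: link(*pair))
        if link(*best)[0] < level:
            break
        clusters = [c for c in clusters if c not in best] + [best[0] | best[1]]
    return sorted(sorted(c) for c in clusters)


for level in (3, 2, 1):
    print(level, clusters_at(level))
```

Run at levels 3, 2, and 1, the routine yields five, four, and three classes respectively, matching the description of Fig. 11.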

Our  hope,  of  course,  is  that  clusters  obtained  by  this  routine  pro¬ 
cedure  will  bear  some  resemblance  to  the  kinds  of  taxonomic  struc¬ 
tures  various  theorists  have  proposed,  and  that  the  clusters  and  their 
branches  can  be  labelled  in  such  a  way  as  to  reflect  the  semantic 
markers  or  dimensions  involved.  Whether  this  hope  is  justified  can 
be  decided  only  by  examining  the  results. 

Semantic  clusters.  In  order  to  illustrate  the  kind  of  results  obtained 
with  this  classification  procedure,  consider  a  study  whose  results  are 
summarized  in  Fig.  12.  Forty-eight  nouns  were  selected  rather  ar¬ 
bitrarily  to  cover  a  broad  range  of  concepts,  subject  to  the  constraint 
that  half  of  them  should  be  names  of  objects  and  the  other  half 
should  be  names  of  nonobjects  (concepts).  This  important  semantic 
marker was introduced deliberately in order to see whether it would
be  detected  by  the  clustering  procedure;  if  so  important  a  semantic 
marker  would  not  come  through  clearly,  then  nothing  would. 


NOUNS


Figure 12. Results of a cluster analysis of 48 nouns, with suggested names
for the clusters indicated in parentheses.

Each of the 48 nouns was typed on a 3" x 5" card, along with a
dictionary definition of the particular sense of the word that was
intended and a simple sentence using the word in that sense. The
cards were classified by 50 judges, their results were tabulated in a
data matrix, and cluster analysis was performed on the data matrix
in order to determine what Johnson calls the optimally “compact”
set of clusters. The five major clusters that were obtained are
shown in Fig. 12, where they are named, quite intuitively, “living
objects,” “nonliving objects,” “quantitative concepts,” “personal
concepts,” and “social concepts.” The finer structure within
each of these clusters is also diagrammed in Fig. 12. For example,
the tree graph shows that 48 of the 50 judges put “plant” and
“tree” into the same class; that 42 or more judges put “plant,”
“tree,” and “root” into the same class; and that 40 or more judges
put “plant,” “tree,” “root,” and “hedge” into the same class.

Did  the  semantic  marker  that  was  deliberately  introduced  into 
the  set  of  words  reappear  in  the  analysis?  Yes  and  no.  The  clusters 
obtained  did  not  contradict  the  hypothesis  that  our  judges  were 
sorting with this semantic distinction in mind, yet their data indicate
a finer analysis into at least five, rather than only two clusters,
so  the  object  marker  is  not  completely  verified  by  these  data.  None¬ 
theless,  the  results  were  sufficiently  encouraging  that  we  continued 
to  study  the  method. 

The 48 nouns listed in Fig. 12 were also chosen to have the characteristic
that each of them could also be used as a verb. In another
study, therefore, the verb senses of these words were defined and illustrated
on the cards that the judges were asked to classify. When
they are thought of as verbs, of course, the object-concept distinction
that is so obvious for these words in their noun usages is no longer
relevant; the object marker would not be expected to appear in the
results of the verb classifications, and in truth it did not. The results
of the verb study are not presented here, however, because I do not
yet understand them. The object marker did not appear, but
neither did anything else that I could recognize. It is my impression
that judges were too much influenced by other words in the particular
sentences in which the verbs were illustrated. Perhaps the semantic
analysis of predicates is basically different from the analysis of
subjects; perhaps verbs signify rather special formulae, complex
functions into which particular nouns can be substituted as arguments,
and the classification of these functions is more complex,
more contingent, more difficult than the classification of their arguments.
This is a deep question which I am not prepared to discuss
here.

There  is,  of  course,  an  important  syntactic  basis  for  classifying 
English words, i.e., the classification into parts of speech. The data
in Fig. 12 were obtained for a single part of speech (nouns) and so
do not give us any indication of the relative importance of syntactic
categories. Fig. 13, however, shows some results obtained by Jeremy
M. Anglin with a set of 36 common words consisting of twelve
nouns, twelve verbs, six adjectives, and six adverbs. Twenty judges
classified these concepts and the analysis of their data reveals five
major clusters. Four of these clusters reflect the syntactic classification,
but one (“sadly,” “suffer,” and “weep”) combines an adverb
with two verbs. With this one exception, however, adult judges
seem  to  work  by  sorting  the  items  on  syntactic  grounds  before  sort¬ 
ing  them  on  semantic  grounds. 

It  is  important  to  notice,  however,  that  the  results  summarized  in 
Fig.  13  were  obtained  with  adult  judges.  Anglin  also  gave  the  same 
test to 20 subjects in the 3rd and 4th grades, to 20 in the 7th grade,
and to 20 more in the 11th grade (average ages about 8.5, 12, and
16 years, respectively). The clusters obtained from the youngest
group of judges are shown in Fig. 14. It is obvious that children
interpret the task quite differently. When asked to put things together
that are similar in meaning, children tend to put together words
that might be used in talking about the same thing, which cuts
right across the tidy syntactic boundaries so important to adults.
Thus all 20 of the children agree in putting the verb “eat” with the
noun “apple”; for many of them “air” is “cold”; the “foot” is used
to “jump”; you “live” in a “house”; “sugar” is “sweet”; and the
cluster of “doctor,” “needle,” “suffer,” “weep,” and “sadly” is a
small  vignette  in  itself. 

These  qualitative  differences  observed  in  Anglin's  study  serve  to 
confirm developmental trends previously established on the basis of
word association tests with children (an excellent discussion of this
work has been given by Doris K. Entwisle), where it is found that
children give more word association responses from different syntactic
classes than do adults. This trend also appears in the classification
data. In Fig. 15 some particular word pairs have been selected
as illustrating most clearly the changes Anglin observed as a function
of age. The thematic combination of words from different parts of
speech, which is generally called a “syntagmatic” response, can be
seen to decline progressively with age, and the putting together of
words in the same syntactic category, generally called a “paradigmatic”
response, increases during the same period.

Figure 13. Cluster analysis of 36 words by adult subjects. Note that syntactic
categories are faithfully respected. Data from J. M. Anglin.

Figure 14. The same 36 words of Fig. 13 were classified by children in the
3rd and 4th grades. Note violation of syntactic categories. Data from J. M.
Anglin. (Axis label: number of subjects.)

Figure 15. Graph illustrating some developmental trends in the classification
data. Data from J. M. Anglin. (Axis label: age groups.)

Although  there  is  no  basic  contradiction  between  results  obtained 
with  word  association  methods  and  with  word  classification  methods, 
some  aspects  of  the  subjective  lexicon  seem  to  be  displayed  more 
clearly  with  the  classification  procedures.  In  order  to  make  some 
comparison  between  the  two  methods,  we  took  word  association 
data collected by Deese and used Johnson's cluster analysis on
them. In this particular study, Deese used the word “butterfly” as a
stimulus  and  obtained  18  different  word  associations  from  50  under¬ 
graduates  at  Johns  Hopkins.  Then  he  used  these  18  responses  as 
stimuli for another group of 50 subjects. He then had response
distributions for 19 closely related words, so that he was able to
compute intersection coefficients (a particular version of an intersection-union
ratio) between all of the 171 different pairs that can
be formed from these 19 words. Deese published the intersection
coefficients in a matrix whose entries could be interpreted as measures
of associative similarity between words. When Johnson’s cluster
analysis was carried out on this data matrix, the results shown
in Fig. 16 were obtained.

Figure 16. The cluster analysis procedure developed by Johnson was applied
to Deese's data on the intersection coefficients for word associations.
(Axis label: intersection coefficients.)
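
The arithmetic behind such a similarity matrix can be sketched briefly. The code below is a hedged illustration, not Deese's procedure: it uses a generic intersection-over-union ratio on response frequencies, which may differ in detail from the particular coefficient Deese computed, and the response lists are invented toy data. The function name `overlap_ratio` is hypothetical.

```python
from collections import Counter
from math import comb


def overlap_ratio(responses_a, responses_b):
    """Intersection-over-union of two response frequency distributions.

    Shared responses are counted at their smaller frequency, and the union
    at the larger frequency. This is a generic ratio, assumed here for
    illustration only.
    """
    a, b = Counter(responses_a), Counter(responses_b)
    shared = sum((a & b).values())   # element-wise minimum of the counts
    total = sum((a | b).values())    # element-wise maximum of the counts
    return shared / total if total else 0.0


# Invented responses to three stimulus words (not data from the study).
toy_responses = {
    "butterfly": ["moth", "insect", "wing", "wing", "fly"],
    "moth": ["butterfly", "insect", "wing", "fly", "fly"],
    "bird": ["wing", "fly", "fly", "nest", "sky"],
}

print(overlap_ratio(toy_responses["butterfly"], toy_responses["moth"]))
print(comb(19, 2))  # number of distinct pairs among 19 words
```

The last line confirms the count given in the text: 19 words yield 19 × 18 / 2 = 171 distinct pairs, one matrix entry per pair.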

The clusters obtained with the word association data are a bit
difficult to interpret. If we ask whether these clusters preserve syntactic
classes, the answer depends on whether we consider certain
words to be nouns or adjectives. For example, “blue” can be used
either as an adjective (as in the phrase “blue sky”) or as a noun (as
in the phrase “sky blue”); “flower” would normally be considered
as either a noun or a verb, but in “flower garden” it functions as an
adjective; “fly” in the cluster with “bird” and “wing” is probably a
verb, but it might be a noun; etc. One could also argue that many
of  the  words  have  multiple  meanings;  for  example,  some  of  the  sub¬ 
jects  who  associated  “spring”  and  “sunshine”  might  have  been 
thinking  of  “spring”  as  a  season  of  the  year  and  others  might  have 
meant  it  as  a  source  of  water.  In  short,  data  of  this  sort  are  useful 
when  words  are  to  be  dealt  with  in  isolation,  as  they  often  are  in 
verbal  learning  experiments,  but  they  do  not  contribute  the  in¬ 
formation  we  need  in  order  to  understand  how  word  meanings  work 
together  in  the  interpretation  of  sentences. 

For  purposes  of  comparison,  therefore,  we  appealed  to  a  lexico¬ 
grapher:  the  19  words  in  Deese’s  study  were  looked  up  in  a  child’s 
dictionary, where a total of 72 different definitions were found.
Each definition was typed onto a separate card, along with the
word  defined  and  a  sentence  illustrating  its  use.  J.  M.  Anglin  and 
Paul  Bogrow  tested  20  judges  with  these  72  items.  Their  results 
are  shown  in  Fig.  17.  Anglin  and  Bogrow  found  nine  major  clusters, 
which  are  quite  different  from  Deese’s  associative  clusters  and  much 
closer  to  the  requirements  of  a  semantic  theory.  For  example,  there 
were  twelve  senses  of  “spring.”  Ten  of  these  comprise  a  single 
cluster,  and  the  similarity  measure  suggests  how  a  lexical  entry  for 
these ten might be organized. “Spring” in the sense of a source of
water did not fit into this cluster, nor did “spring” in the seasonal
sense; those two senses would have to have separate entries. Whereas
the preceding studies illustrated the use of the classification for widely
different concepts, this one indicates that the method might also
be useful for investigating the finer details of closely related meanings.

Figure 17. Seventy-two senses of the nineteen words studied by Deese (see
Fig. 16) were classified and a cluster analysis of the classification data was
performed.

One final example of this method may be of interest. As mentioned
previously, Osgood and his coworkers have made extensive
use of rating scales defined by antonymous adjectives in order to
define a coordinate system in which meanings can be characterized
by their spatial position. We decided, therefore, to use antonymous
pairs of adjectives in a classification study. One hundred of the adjective
pairs Osgood had used were selected and typed on cards
(this time without definitions or examples, since the antonymous relations
left little room for ambiguity) and 20 judges were asked to
classify them. The results for 41 of these 100 pairs are shown in
Fig. 18 to illustrate what happened. (Data for the other 59 pairs are
analogous, but limitations of space dictate their omission.)

Osgood  finds  rather  consistently  that  the  most  important  dimen¬ 
sion  in  his  semantic  differential  is  the  good-bad,  or  evaluative 
dimension.  Most  of  the  antonymous  pairs  that  were  heavily  loaded 
on  Osgood’s  evaluative  dimension  turned  up  in  our  cluster  analysis 
in  the  three  clusters  shown  in  the  lower  half  of  Fig.  18.  Inspection 
of these three clusters suggests to me that our judges were distinguishing
three different varieties of evaluation which, for lack of
better terms, might be called moral, intellectual, and esthetic. To
the  extent  that  Osgood’s  method  fails  to  distinguish  among  these 
varieties  of  evaluation,  it  must  be  lacking  in  differential  sensitivity. 

Fig.  18  also  presents  a  large  cluster  of  adjectives  that,  according  to 
the  introspective  reports  of  some  judges,  might  be  considered  as 
going  together  because  “they  can  all  be  used  to  describe  people.” 
It  is  not  easy  to  know  what  this  characterization  means,  since  al¬ 
most  any  adjective  can  be  used  to  describe  someone,  but  perhaps  it 
points  in  a  suggestive  direction.  It  should  be  noted,  however,  that 
this  characterization  is  not  given  in  terms  of  similarities  of  mean¬ 
ings  among  the  adjectives,  but  rather  in  terms  of  similarities  among 
the  words  they  can  modify.  Once  again,  therefore,  we  stumble  over 
this  notion  that  the  nouns  may  have  a  relatively  stable  semantic 
character,  but  the  words  that  go  with  them,  the  adjectives  and 
verbs,  are  much  more  dependent  on  context  for  their  classification. 

There  are  still  difficulties  that  must  be  overcome  before  the  clas¬ 
sification  method  can  be  generally  useful.  Some  way  must  be  found 
to  work  with  more  than  100  meanings  at  a  time.  Some  way  should 
be  sought  to  locate  generic  words  at  branching  points.  Effects  of 
context— both  of  the  sentence  in  which  the  meaning  is  exemplified, 
and  also  of  the  context  provided  by  the  other  words  in  the  set  to  be 
classified— must  be  evaluated.  Relations  of  cluster  analysis  to  factor 
analysis  need  to  be  better  understood,  and  so  on  and  on  down  a 
catalogue  of  chores.  But  the  general  impression  we  have  formed 
after  using  the  classification  method  is  that,  while  it  is  certainly  not 
perfect,  it  seems  to  offer  more  promise  for  semantic  theory  than  any 
of the other techniques psychologists have used to probe the structure
of the subjective lexicon.

Figure 18. One hundred pairs of antonymous adjectives taken from Osgood,
Suci, and Tannenbaum were sorted by 20 judges; 41 of the pairs are
shown here as they clustered under the classification procedure. Note that
the good-bad dimension so important in the semantic differential of Osgood
is here analyzed into three separate clusters.

SUMMARY

A too-short summary of this paper might be that language is
what it is because we use it to say things. The capacity to say something
(to affirm or deny some comment about some topic) may not
be  uniquely  limited  to  man,  but  certainly  he  is  better  at  it  than  any 
other  animal.  Saying  something  requires  that  we  have  certain  gram¬ 
matical  machinery  in  our  languages,  ways  to  combine  topics  and 
comments,  ways  to  make  one  sentence  the  topic  for  another  com¬ 
ment.  Saying  something  also  requires  that  we  have  certain  semantic 
machinery,  so  that  what  we  say  can  be  interpreted  by  our  listener 
on  the  basis  of  what  he  knows  about  the  meanings  of  its  constitu¬ 
ent  parts.  These  are  problems  that  linguists  and  psychologists  share, 
and  that  form  the  kernel  of  the  young  science  called  psycholinguis¬ 
tics.  If  and  when  we  are  able  to  achieve  a  deeper  understanding  of 
what  men  do  when  they  say  something,  we  should  be  able  to  use 
that understanding to improve the communication of meaning.

III. Processes of Interpersonal
Accommodation

Harold  H.  Kelley 

This paper describes some of my research on the processes by which
interdependent  persons  make  accommodations  to  one  another. 
Social  interdependence  refers  to  the  fact  that  in  most  interpersonal 
relationships,  the  satisfaction  of  each  person's  needs  is  dependent  in 
some  manner  upon  the  actions  of  the  other  persons.  Social  inter¬ 
dependence  is,  of  course,  a  pervasive  characteristic  of  human  life.  It 
ranges  from  the  temporary  but  severe  interdependence  a  driver  en¬ 
dures  on  the  highway  in  his  relations  with  other  motorists,  to  the 
equally  serious  but  more  permanent  interdependence  that  economic 
circumstances,  laws  and  social  customs,  and  emotional  attachments 
impose  upon  married  couples.  We  are  interdependent  with  other 
persons  in  our  solution  of  common  external  problems,  in  our  striv¬ 
ing  to  gain  and  maintain  social  status,  and  in  our  coping  with  per¬ 
sonal  needs  and  anxieties.  This  is  merely  to  say  that  there  is  ample 
justification  for  the  careful  analysis  of  social  interdependence,  if  one 
can  ever  justify  studying  a  phenomenon  by  reference  to  its  common 
occurrence. 

Let  me  first  briefly  describe  my  general  approach  to  this  research 
area.  The  particular  type  of  interdependence  that  characterizes  a 
relationship  is  viewed  as  posing  one  or  more  problems  for  the 
participants  to  solve.  If  they  are  able  to  do  so,  their  relationship 
will  be  a  satisfying  and  viable  one.  If  not,  it  will  be  less  rewarding 
than it might be and may even be so unstable as eventually to
disrupt. The solutions to interdependence problems consist of those
patterns  and  routines  of  interaction  which  insure  adequate  satisfac¬ 
tion  of  each  participant’s  needs. 

Two  very  simple  examples  will  serve  to  illustrate  this  point  of 
view.  Example  I,  in  Figure  1,  shows  a  case  where  two  persons  are 
totally in control of each other’s fates. They have mutual fate control.*
One convenient way to represent certain aspects of interdependence
is by means of a payoff matrix such as is used in game
theory. The matrix describes, so to speak, the problem confronting
the participants. In Example I, each of the two persons, A and B,
has only two possible responses. The four cells in the body of the
matrix show the four possible combinations of their respective actions
and the various consequences for each person. The upper
right portion of each cell shows the consequences for person A and
the lower left portion, the consequences for B. “Plus” means a
favorable outcome, “minus,” an unfavorable one, and in Figure 2,
“zero” means a neutral outcome.

Figure 1. Example I: Mutual Fate Control. (Panels: Interdependence Problem
and Accommodative Solution. Columns: Person A, left and right; rows:
Person B, left and right.)

The matrix for Example I shows that each person can provide
rewards to the other without any effect upon his own outcomes, and
that rewards can be provided to both persons at once. This is obviously
a simple interdependence problem and the indicated accommodative
solution is equally simple, namely, an exchange or trade of
“left” responses. In a real-life instance, “left” might mean paying
the other person a compliment, and “right” might mean criticizing
him. A mutual accommodation would require merely exchanging
compliments rather than expressing criticisms.

* The terms for the relationships and much of the analysis here presented are
derived from Thibaut and Kelley (1959).
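
The matrix for Example I can be written down as a small lookup table. This is a sketch under stated assumptions: Figure 1 is not reproduced here, so the cell entries are inferred from the prose (a “left” response rewards the other person, a “right” response harms him), and the names `example_1`, `own_choice_is_irrelevant`, and `mutually_rewarding_cells` are invented for the illustration.

```python
LEFT, RIGHT = "left", "right"
PLUS, MINUS = +1, -1

# Keys are (A's response, B's response);
# values are (outcome for A, outcome for B).
example_1 = {
    (LEFT, LEFT): (PLUS, PLUS),      # compliments exchanged
    (RIGHT, LEFT): (PLUS, MINUS),    # A criticizes, B compliments
    (LEFT, RIGHT): (MINUS, PLUS),    # A compliments, B criticizes
    (RIGHT, RIGHT): (MINUS, MINUS),  # criticisms exchanged
}


def own_choice_is_irrelevant(matrix):
    """True if neither person's own response changes his own outcome."""
    a_ok = all(matrix[(LEFT, b)][0] == matrix[(RIGHT, b)][0]
               for b in (LEFT, RIGHT))
    b_ok = all(matrix[(a, LEFT)][1] == matrix[(a, RIGHT)][1]
               for a in (LEFT, RIGHT))
    return a_ok and b_ok


def mutually_rewarding_cells(matrix):
    """Cells in which both persons receive a favorable outcome."""
    return [cell for cell, (a_out, b_out) in matrix.items()
            if a_out == PLUS and b_out == PLUS]


print(own_choice_is_irrelevant(example_1))
print(mutually_rewarding_cells(example_1))
```

Under these assumptions the only cell that rewards both persons is the exchange of “left” responses, which is the accommodative solution named in the text.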

Another  simple  example  is  shown  in  Example  II,  Figure  2.  This 
matrix  represents  a  case  of  mutual  behavior  control  which  gives  the 
pair  joint  control  over  their  respective  outcomes.  A  relation  of  this 
sort might occur when, for some reason, similar responses are incompatible
but opposite responses are complementary and rewarding to
the person who makes the left response. An important feature of the
relationship is that only one person at a time can gain his desired
goal. What is indicated as the accommodative solution to this problem
is a turn-taking procedure in which the two persons alternate
between the left-right and right-left combinations of responses. In a
natural situation, “left” might mean being first to use the bathroom
in the morning (which enables that person to catch his bus without
undue exertion) and “right” might mean using it second with consequent
loss of time. Or, “left” might mean listening and learning
something from the other person and “right,” talking and telling
the other something.

Figure 2. Example II: Mutual Behavior Control. (Columns: Person A, left
and right; rows: Person B, left and right.)

These are only two of many possible examples. With slight variations,
the mutual behavior control relationship in Example II becomes
a pure coordination problem rather than a turn-taking one.
In a moment, we'll see a relationship which combines features of
these two examples; one person has fate control and the other has
behavior  control. 

One of the relationships that has been subjected to much investigation
in recent years (Rapoport and Chammah, 1965) is known
as the Prisoner’s Dilemma. This relationship can be formed, as
shown in Figure 3, by superimposing upon the mutual fate control
pattern of Example I an appropriate degree of control by each
person over his own outcomes, which motivates him to make the
non-accommodative response. The story that goes with this relationship
concerns two prisoners being held as suspected accomplices in a
given crime. The district attorney has a weak case against them and
they all know it, but he has cleverly placed them in separate cells
out of contact with each other. They both know the facts summarized
in the left matrix: that is, if they maintain silence (left
response) they both will go free (indicated by the +2's in the upper
left), if one confesses (right), he will be released (+2) and the other
one will go to jail (−2), and if both confess, both will go to jail
(−2). This mutual fate control relationship would create no problem
for the prisoners (they would maintain silence), but the wily
district attorney offers each one a special reward (+1 in the next
matrix) for confessing: for example, he promises not to press some
other charges for which he has managed to unearth evidence. Together
these circumstances create the relationship at the right, which
is one example of what is known as the Prisoner's Dilemma. The
poignancy of this relationship, the dilemma it creates for the prisoners,
is that the mutually desirable solution is obvious (in the upper
left cell, +2's for both) but is not attainable under these conditions
of separate decisions without opportunity for communication. As
each prisoner analyzes the logic of the situation in the light of his
own interests, he is led to make the “right” or confession choice,
even as he realizes that the other person is probably making the same
choice and that they are being drawn ineluctably away from the
appropriate accommodation to a solution that is ruinous for both.

Figure 3. Bases of the Prisoner's Dilemma Relationship.

The Prisoner’s Dilemma relationship illustrates one of the many
complexities that can be introduced even into the simple two-by-two
matrix. Anatol Rapoport, a mathematical behavioral scientist
at the University of Michigan, and his colleague Melvin Guyer
(1966) have attempted to make a taxonomy of two-by-two games
for the simple case where each person’s four possible payoffs are
simply rank ordered from best to worst. They identify 78 games
that are non-identical (no one of the 78 can be derived from another
merely by interchanging rows, columns, players, or some combination
of these) and they find that the 78 can be classified into 10
categories. In addition to the Prisoner’s Dilemma, other types of
games in their taxonomy have become identified by such titles as
“Chicken” or “Brinksmanship,” “Let George do it” or “Hero,”
“The Battle of the Sexes,” and “Inspector-evader.” The point of
all this is that even with a simple two-by-two paradigm, a great
variety of social relationships can be defined.
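
Both constructions in this passage can be checked with a few lines of code. The sketch below is an illustration only and is not taken from the studies cited. It assumes that “superimposing” the two patterns means adding the payoffs cell by cell (Figure 3 is not reproduced here), and it counts the rank-ordered two-by-two games by brute force, treating two games as the same whenever one can be obtained from the other by interchanging rows, columns, or players. All function names are invented for the example.

```python
from itertools import permutations

LEFT, RIGHT = 0, 1  # left = keep silent, right = confess


def fate_control(partner_choice):
    """Outcome a prisoner receives from his partner's choice (+2 or -2)."""
    return 2 if partner_choice == LEFT else -2


def confession_bonus(own_choice):
    """The district attorney's special reward (+1) for confessing."""
    return 1 if own_choice == RIGHT else 0


def prisoners_dilemma():
    """Add the two patterns cell by cell (an assumption about Figure 3).

    Keys are (A's choice, B's choice); values are (A's payoff, B's payoff).
    """
    return {
        (a, b): (fate_control(b) + confession_bonus(a),
                 fate_control(a) + confession_bonus(b))
        for a in (LEFT, RIGHT) for b in (LEFT, RIGHT)
    }


# For the taxonomy, a game is a 2x2 grid of (row payoff, column payoff) pairs.
def swap_rows(g):
    return (g[1], g[0])


def swap_cols(g):
    return tuple((row[1], row[0]) for row in g)


def swap_players(g):
    """Transpose the grid and exchange the two payoffs in every cell."""
    return tuple(tuple((g[c][r][1], g[c][r][0]) for c in (0, 1))
                 for r in (0, 1))


def canonical(g):
    """Smallest of the eight relabelings of g, used as a class label."""
    images = []
    for p in (False, True):
        for r in (False, True):
            for c in (False, True):
                h = swap_players(g) if p else g
                h = swap_rows(h) if r else h
                h = swap_cols(h) if c else h
                images.append(h)
    return min(images)


def count_distinct_ordinal_games():
    """Count strictly rank-ordered 2x2 games up to relabeling."""
    ranks = list(permutations((1, 2, 3, 4)))
    classes = set()
    for a in ranks:        # row player's ranking of the four cells
        for b in ranks:    # column player's ranking of the four cells
            game = (((a[0], b[0]), (a[1], b[1])),
                    ((a[2], b[2]), (a[3], b[3])))
            classes.add(canonical(game))
    return len(classes)


if __name__ == "__main__":
    for cell, payoffs in sorted(prisoners_dilemma().items()):
        print(cell, payoffs)
    print(count_distinct_ordinal_games())
```

Under the additive assumption, confessing pays one point more than silence whatever the partner does, yet mutual confession leaves each prisoner with -1 instead of +2. The enumeration starts from 24 × 24 = 576 rank-ordered games and reduces them to 78 classes, in agreement with the count reported by Rapoport and Guyer.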

In  addition,  of  course,  any  given  relationship  can  be  expanded  to 
permit  of  more  responses  and  graded  variations  in  outcomes  as,  for 
example,  Pilisuk  and  Rapoport  (1964)  have  done  for  the  Prisoners’ 
Dilemma  game.  It  is  also  clear  that  for  some  of  the  interdependence 
problems we can specify, the accommodative solutions are not as
easily identifiable as in the simple examples I have given. Problems
arise in this respect when the relationship is such that the
generally accepted criteria for what constitutes a “good” accommodation
come into conflict (e.g., the notion of maximizing joint
outcomes  may  conflict  with  the  concept  of  equality  of  outcomes). 

In any case, my two simple examples illustrate the general approach
to this area of investigation. In each instance, we have a
statement of the interdependence problem and a statement of the
accommodative solution. The problem consists of the particular
type of interdependence that characterizes the relationship. The
solution consists of an interaction pattern or routine that enables
the participants to enjoy the rewards and to minimize the costs inherent
in their relationship. Given this perspective, the general
goals of the research are (1) analysis of the major types of social
interdependence and of the interaction routines that provide accommodations
to the problems they pose, and (2) (and most importantly)
investigation of the processes by which accommodations
are  attained. 

Now,  let  me  outline  the  research  performed  within  this  frame¬ 
work.  The  three  types  of  experiments  I  want  to  describe  deal  with 
two  types  of  interdependence  problems:  (1)  common  interest  prob¬ 
lems  and  (2)  conflict  of  interest  problems.  The  common  interest 
problems  have  been  studied  under  conditions  of  minimal  informa¬ 
tion  and  minimal  communication.  The  conflict  of  interest  problems 
have  been  studied  (a)  under  conditions  of  full  information  but 
partial  communication,  and  (b)  under  conditions  of  partial  infor¬ 
mation  but  full  communication.  The  reasons  for  my  interest  in 
these  particular  information  and  communication  conditions  will 
become  clear  as  we  proceed. 

First,  then,  let  us  consider  common  interest  problems.  These  are 
relationships  characterized  by  the  existence  of  one  or  more  mutually 
preferred  cells  in  the  interdependence  matrix.  The  first  situation 
we  have  studied  is  that  shown  in  our  first  example,  Figure  1,  which 
is the case of mutual fate control. It seems obvious that persons related
in this manner can achieve a stable, mutually satisfactory accommodation
(that they can agree upon the exchange of “lefts”) if
they have full information about their relationship and are able to
communicate about it, and if they are motivated solely as indicated
in the matrix. It is less clear, however, whether the appropriate exchange
can be worked out without the participants’ awareness of
their interdependence and without communication. The original
work on the minimal social situation by Sidowski and his colleagues
(1956) raised the question as to whether the accommodation can be
achieved with minimal information and minimal communication.
This is an important question. Inadequate understanding and poor
communication are undoubtedly common enough in the real world
that knowledge about how stable arrangements promoting the
common welfare can evolve despite such handicaps might prove
to  be  of  considerable  value. 

The experimental setting is schematically shown in Figure 4.
Two subjects, A and B, are seated at tables in two separate rooms.
They have been brought to the laboratory independently and they
have absolutely no knowledge of one another’s existence. Each is
shown that he can receive points (as indicated on a counter) and he
is told to try to earn as many points as possible. He is also shown
that he can receive electric shocks by way of electrodes attached to
his arm. One does not need to tell him to try to avoid receiving the
shocks. A “Start” signal is given and the subjects begin making their
responses. The subjects do not know it, but the buttons are connected
as shown in the diagram. Each one’s left button gives the
other person a point and each one’s right gives the other one a
shock. The question is, can they learn to give points and to avoid
giving shocks without having any other information or opportunities
for communication?

Figure 4. Schematic Diagram of Minimal Social Situation. (Panels: Room 1
and Room 2.)

The mutual accommodation can evolve under these minimal circumstances.
Thibaut, Radloff, Mundy and I (1962) were able to
show that by following a simple trial and error formula, individuals
are able, at least under certain conditions, to solve the mutual fate
control problem without either awareness of their relationship or
explicit communication. To illustrate, let us assume that when each
person makes a response and then receives a positive outcome, he
repeats that response, and that when he receives a negative outcome,
he changes to the other response the next time. In other words, assume he
follows the simple reinforcement pattern that has been referred to as
a “win-stay, lose-change” strategy. Finally, assume that the two make
their responses simultaneously. If one pursues the implications of
these assumptions, as in Figure 5, one sees that the sequence of
events should follow the pattern of arrows shown there. For example,
if on the first trial, A goes left and B goes right, on the next
trial, both should go right, the reason being that B, receiving a
positive outcome, repeats his right response, but A, receiving a negative
outcome, changes from left to right. Then on the following
trial, inasmuch as both have just received negative outcomes, both
should change and go left. When they do so and reach the upper
left cell which yields positive consequences for both, they should continue
with left responses indefinitely. Our data show that this is
essentially what happens. By a simple trial and error process, they
are able to learn to accommodate to each other’s interests by eventually
making mainly left responses. And this is possible without their
understanding the relationship and without communication.

Figure 5. Sequence of Events with Simultaneous Responding.
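
The reasoning about simultaneous responding can be traced mechanically. The following is a minimal sketch, assuming an idealized subject who applies the win-stay, lose-change rule without error on every trial; real subjects, as the text notes, only approximate this. The function names are invented for the illustration.

```python
LEFT, RIGHT = "left", "right"


def outcome(partner_response):
    """Mutual fate control: my outcome depends only on my partner's response."""
    return +1 if partner_response == LEFT else -1


def flip(response):
    return RIGHT if response == LEFT else LEFT


def win_stay_lose_change(own_response, own_outcome):
    """Repeat the response after a positive outcome, switch after a negative one."""
    return own_response if own_outcome > 0 else flip(own_response)


def simultaneous_trials(a, b, n_trials):
    """Return (A's response, B's response) for the first trial and n_trials more."""
    history = [(a, b)]
    for _ in range(n_trials):
        a_outcome, b_outcome = outcome(b), outcome(a)
        # Both persons update at the same time, each from his own last outcome.
        a = win_stay_lose_change(a, a_outcome)
        b = win_stay_lose_change(b, b_outcome)
        history.append((a, b))
    return history


for trial in simultaneous_trials(LEFT, RIGHT, 4):
    print(trial)
```

Starting from A left and B right, the trace passes through the right-right cell and then settles on left-left, as described above. Under these assumptions every starting cell reaches left-left within two further trials.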

In  the  foregoing,  I  have  assumed  that  the  two  persons  respond 
simultaneously. Our researches have further shown that the accommodation
does not occur if the subjects respond in an alternating
sequence. 

This can be illustrated by Figure 6. Assume, for example, that
B has made his left response and A, his right one. Consequently
they are in the upper right cell. Inasmuch as B has received a
negative outcome, he will change from left to right. Person A now
experiences a negative outcome so, in turn, he changes from his right
to his left. This carries them to the lower left cell. It is B’s turn to
respond but because he is now receiving a positive outcome, he
will persist and repeat his last response, which was right. It then
becomes A’s turn again and, experiencing a negative outcome, he will
shift back to his right response. This creates a negative outcome for
B so that, when it subsequently becomes his turn, he too changes.
And so it continues, cycling back and forth from upper right to
lower left, each time passing through the lower right cell. Unless
the pair is lucky enough to get in the upper left cell by accident,
they will not solve the problem. They cannot systematically work
their way to that cell. Our experimental data are consistent with this
analysis. With an alternation sequence of responding, the
accommodative solution is rarely attained for this mutual fate control
relationship.

Figure 6. Sequence of Events with Alternating Responding.
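The alternating case can be traced in the same way. This is a sketch under the same assumed payoffs (left rewards the partner, right punishes him); the turn-taking rule, in which the person whose turn it is reacts to the outcome he is currently receiving, is my reading of the description above, and `alternating_path` is an illustrative name.

```python
LEFT, RIGHT = "L", "R"


def flip(choice):
    """Return the opposite response."""
    return RIGHT if choice == LEFT else LEFT


def alternating_path(a, b, first="B", turns=8):
    """Trace (A's response, B's response) when the two persons take
    turns applying win-stay, lose-change. Under the assumed payoffs a
    person is 'losing' whenever his partner's current response is right."""
    path = [(a, b)]
    mover = first
    for _ in range(turns):
        if mover == "A":
            if b == RIGHT:      # A is receiving a negative outcome
                a = flip(a)
            mover = "B"
        else:
            if a == RIGHT:      # B is receiving a negative outcome
                b = flip(b)
            mover = "A"
        path.append((a, b))
    return path


# Start in the upper right cell (A right, B left) with B moving first.
print(alternating_path(RIGHT, LEFT, first="B", turns=8))
```

In this sketch the pair shuttles between the upper right cell (A right, B left) and the lower left cell (A left, B right) by way of right-right, and the left-left cell is never entered from any other starting cell.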

The  second  interdependence  problem  we  have  studied  under 
these  minimal  social  conditions  is  the  mixed  fate  control  and  be¬ 
havior  control  relationship  shown  in  Figure  7.  Person  A  has  fate 
control  over  B  and  B  has  behavior  control  over  A.  I  cannot  go  into 
the  matter  in  detail  but  Rabinowitz,  Rosenblatt  and  I  (1966)  have 
obtained  results  from  this  relationship  that  are  highly  consistent 
with  the  reasoning  I  have  just  outlined.  Under  conditions  of  simul¬ 
taneous  responding,  accommodation  is  more  difficult  to  achieve  in 
this  relationship  than  in  the  case  of  mutual  fate  control.  (This  is 
something  the  reader  can  prove  for  himself  by  applying  the  win-stay, 
lose-change rule to this matrix.) However, under conditions of ad
lib responding, when subjects are free to change responses whenever
they wish, the results are reversed. Accommodation is more
dependably achieved in the fate control-behavior control relationship
than in the mutual fate control relationship.

Figure 7. Relationship of Fate Control and Behavior Control.
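The suggested exercise for the simultaneous case can be sketched as follows. The matrix of Figure 7 is not reproduced in the text, so the payoffs here are an assumption: B's outcome is positive whenever A goes left (A's fate control over B), and A's outcome is positive whenever A's response matches B's (B's behavior control over A). The function names are illustrative.

```python
LEFT, RIGHT = "L", "R"


def flip(choice):
    """Return the opposite response."""
    return RIGHT if choice == LEFT else LEFT


def outcome_fcbc(a, b):
    """Assumed fate control-behavior control payoffs (hypothetical,
    since the Figure 7 matrix is not given in the text)."""
    a_wins = (a == b)       # behavior control: A gains by matching B
    b_wins = (a == LEFT)    # fate control: B gains when A goes left
    return a_wins, b_wins


def simultaneous_path(a, b, trials=6):
    """Both persons apply win-stay, lose-change on every trial."""
    path = [(a, b)]
    for _ in range(trials):
        a_wins, b_wins = outcome_fcbc(a, b)
        a = a if a_wins else flip(a)
        b = b if b_wins else flip(b)
        path.append((a, b))
    return path


print(simultaneous_path(LEFT, RIGHT))
```

With these assumed payoffs a pair that does not begin in the left-left cell cycles through the other three cells and never reaches it, which agrees with the poorer accommodation reported for simultaneous responding in this relationship.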

These results are summarized by the table in Figure 8, where I
have indicated the quality of accommodation achieved in the two
types of relationship (mutual fate control and fate control-behavior
control) under three conditions of response evocation (simultaneous,
alternating, and ad lib). “Good” means that the accommodation
occurs successfully under minimal social conditions and “poor”
means that it does not or only to a slight degree. (The question
mark in the FC-BC row under the alternating response condition
means that we have no data on that case as yet. We would expect
good accommodation under those conditions.)

TYPE OF                        TIMING CONDITIONS
INTERDEPENDENCE
PROBLEM                    SIM       ALT       AD LIB

MFC                        GOOD      POOR      POOR

FC-BC                      POOR      ?         GOOD

Figure 8. Summary of Experimental Results from the Minimal Social
Situation.

The  point  I  wish  to  emphasize  by  reference  to  this  table  is  the 
importance  of  timing  factors  in  the  achievement  of  accommodation. 
We  know  from  a  variety  of  studies  that  the  type  of  interdependence, 
information,  communication  and  incentive  are  four  factors  govern¬ 
ing  the  achievement  of  accommodation.  Now  we  must  add  a  fifth 
factor  to  our  list,  which  might  be  called  process  constraints.  These 
constraints  govern  the  temporal  structure  of  the  interaction,  the 
distribution  of  the  process  over  time,  and  the  order  and  timing  of 
responses.  The  process  constraints  include  external  constraints 
(such  as  those  imposed  by  the  experimental  procedure  or  by  the 
calendar  or  clock)  and  internal  constraints  (such  as  are  reflected  in 
natural  variations  in  response  times  over  different  situations  and 
different  individuals).  Our  experimental  variations  of  these  con¬ 
straints  show  that  under  the  minimal  social  conditions,  they  interact 
with  the  type  of  interdependence  to  determine  whether  or  not 
accommodation  will  be  achieved. 

Anecdotes  from  natural  relationships  also  suggest  that  timing  of 
actions  is  often  a  controlling  factor  in  the  course  of  a  relationship. 
Quarreling lovers must feel like “making up” at the same time;
nations  lose  an  opportunity  to  resolve  a  dispute  because,  when  one 
is  willing  to  negotiate,  internal  political  events  make  it  impossible 
for  the  other  to  do  so;  a  management  offer  that  the  union  could 
have  accepted  yesterday  becomes  unacceptable  today.  These  ex¬ 
amples  and  our  experimental  results  make  it  clear  that  the  “struc¬ 
ture”  of  time,  the  distribution  of  the  process  over  time,  is  worthy 
of  special  analysis  in  its  own  right. 

The  main  point  of  these  studies,  of  course,  is  that  they  illustrate 
the  achievement  of  accommodation  by  means  of  very  simple  process¬ 
es.  The  accommodation  occurs  without  the  participants’  knowledge 
of  their  relationship,  without  communication,  and  without  their 
being  oriented  toward  cooperation,  each  being  solely  concerned 
with  his  own  interests.  I  should  note  here  that  similar  processes 
have  been  identified  by  Seymour  Rosenberg  (1953)  in  his  work  on 
contaminated  feedback  and  by  Christie,  Luce  and  Macy  (1952)  who 
observed what they termed “locally rational” processes in the implicit
development of organization in communication networks.

The  existence  of  these  processes  in  the  laboratory  may  have  im¬ 
plications  for  interpersonal  accommodations  in  natural  settings.  For 
example,  our  results  with  the  mutual  fate  control  relationship 
(Figure  1)  have  possible  implications  for  such  phenomena  as  the 
implicit  development  of  collusion  between  sellers  providing  a  given 
product  to  the  same  population  of  customers.  The  movement  into 
the  left-left  cell  corresponds  to  their  both  setting  high  prices  for 
their  product.  The  results  from  the  fate  control-behavior  control 
relationship  (Figure  7)  have  possible  implications  for  mutual  accom¬ 
modation  in  such  interdependence  relationships  as  exist  between 
leaders  and  their  followers.  The  movement  into  the  left-left  cell 
corresponds  to  a  leader’s  providing  wise  leadership  in  exchange  for 
the  support  of  his  followers. 

However,  the  existence  of  these  processes  in  natural  settings— 
whether  they  are  frequent  or  rare,  and  where  they  occur,  if  at  all— is 
likely  to  be  difficult  to  ascertain.  Because  they  require  neither 
understanding  of  the  relationship  nor  explicit  communication  about 
it,  their  presence  would  not  ordinarily  be  known  to  the  persons 
involved.  Therefore,  information  about  these  processes  is  not  likely 
to  be  available  in  subjective  or  introspective  reports  about  social 
relationships.  Furthermore,  their  direct  identification  in  the  midst 
of  numerous  other  more  complex  processes  is  problematical.  In  this 
latter  regard,  we  have  found  evidence  of  a  simple  accommodative 
process  of  this  sort,  co-existing  with  “higher  level”  processes,  in 
a  much  more  complex  experimental  relationship.  This  process  was 
observed in studies of bargaining in the “bilateral monopoly”
relationship, following the research of Siegel and Fouraker (1960).
Dietmar Schenitzki’s careful analysis of the bargaining process (reported
in Kelley, 1964) revealed that, even as the bargainers were explicitly
communicating about the negotiation problem, the pattern of their
successive offers and concessions enabled them implicitly to attain
agreements that maximized their joint profits (which is one criterion
of successful accommodation). This occurred while the participants
pursued  their  conflicting  interests  and  without  their  awareness  of 
or  explicit  communication  about  this  common  interest  aspect  of 
their  relationship.  (Schenitzki’s  work  will  be  described  in  more  de¬ 
tail  later  in  this  paper.) 

Now, consider the conflict of interest problems. In the present
context,  these  consist,  not  of  perfectly  competitive  relations  (such 
as  zero-sum  or  constant-sum  games),  but  of  relationships  in  which 
there  is  some  conflict  of  interest  component— some  disagreement  as  to 
the  most  desirable  accommodation  point.  This  type  of  relationship 
is  illustrated  by  the  second  example  (Figure  2)  which  portrays  a 
relationship  in  which  the  two  persons  gain  their  respective  highest 
outcomes  in  different  cells  of  the  matrix.  This  is  what  has  been 
described  as  a  mixed  motive  relationship:  the  parties  have  a 
common  interest  in  avoiding  the  mutually  negative  cells,  but  a 
conflict  of  interest  over  which  of  the  other  two  cells  to  settle  on. 

Once  again,  it  seems  reasonable  to  assume  that  with  full  informa¬ 
tion,  full  communication  and  repeated  occasions  for  interaction, 
persons  who  are  motivated  as  indicated  in  the  matrix  would  have 
no  difficulty  in  achieving  an  accommodation.  Either  the  logical 
requirements of the relationship would prevail (and suggest a
turn-taking procedure) or, more likely, the appropriate rule or norm,
learned as an effective means of accommodation in other similar
relationships, would be obvious as a solution to this problem. For this
reason, research on this problem, similar to that on common interest
problems, has focussed on the achievement of accommodation
while under some handicap, either of partial information or of
partial  communication. 

Our  work  on  conflict  of  interest  problems  has  dealt  first  with 
instances of full information but limited communication. Full
information means that both persons are fully aware of the payoff
matrix and know each other’s outcomes at every point in the
interaction. By limited communication is meant that they may use only
the actions on the matrix as a means of transmitting information
to  each  other.  They  do  not  have  the  usual  means  of  verbal  com¬ 
munication  and  must  rely  on  tacit  communication,  that  is,  communi¬ 
cation  by  actions  or  moves  within  the  game. 

The  research  on  conflict  of  interest  under  full  information  and 
limited  communication  has  dealt  with  (1)  interdependent  avoidance 
or  escape  and  (2)  interdependent  approach.  In  the  first,  the  inter¬ 
dependence  arises  from  the  fact  that  the  two  persons  must  use  a 
limited  facility  to  avoid  an  impending  danger,  and  in  the  second 
it  arises  from  the  necessity  of  their  using  a  limited  route  to  reach 
separate  goals. 

One important result in both areas of research has been that the
accommodation  processes  are  markedly  affected  by  the  magnitude 
of  incentives  or  consequences  involved  in  the  relationship.  For  our 
study  of  interdependent  escape  we  began  with  Alexander  Mintz' 
clever  simulation  of  a  “panic”  situation  (1951).  His  experimental 
task  required  a  number  of  subjects  to  attempt  to  withdraw  cones 
from  a  large  bottle  within  a  short  period  of  time.  The  bottleneck 
would  permit  only  one  person  at  a  time  to  withdraw  his  cone  and 
thereby  to  escape  a  dangerous  flood  of  water  entering  the  bottle  at 
the  bottom.  If  several  persons  attempted  to  withdraw  their  cones  at 
the  same  time,  they  jammed  the  bottleneck  and  risked  being  trapped 
in  the  bottle. 

In our experiments, we simulated these bottleneck conditions
with electric circuitry. Each of seven subjects was placed in a
separate cubicle in front of a console such as is shown in Figure 9. At
the start of a trial, each subject’s push button and indicator light
were red, indicating that he was in danger. The subject was told that
unless he succeeded in changing the color of his light to green
before an indefinite time deadline, he would receive some sort of
punishment. The subject could attempt to escape from the danger
by simply pressing the button, in which case his button turned
yellow and an indicator light corresponding to him turned yellow
on every other subject’s console. If one subject and one subject alone
held his button down for three seconds, his light turned green,
indicating that he had succeeded in escaping from the danger. On the
other hand, if two or more subjects pressed their escape buttons at
the same time, the buttons became red and yellow and this
indicated a “jam.” A jam meant that no one succeeded in escaping
and valuable time was wasted. In short, it was possible for the
subjects to escape only by pressing their buttons one at a time and
waiting for each person’s three-second escape period to elapse. Any
simultaneous escape attempts or interruptions of another person’s
escape attempts simply wasted time without producing any escapes.
The seven subjects were given a certain time period in which to
escape. They did not know it but this time period was more than
adequate for all of them to escape. All who failed to escape before
the time ran out were then required to suffer various negative
consequences.

Figure 9. Subject Console for the Interdependent Escape Experiment.

Thus,  at  the  start  of  a  trial,  each  subject  was  faced  with  a  conflict 
between  waiting  for  the  others  to  leave  and  attempting  immediately 
to  leave  himself.  The  indicator  lights  provided  him  with  complete 
information  about  what  the  others  were  doing  but  he  had  no  other 
means  of  communicating  or  coordinating  with  them.  This  situation 
obviously  requires  coordinated  behavior  similar  to  turn-taking, 
specifically,  formation  of  an  orderly  queue  in  which  each  person 
waits  his  turn  to  escape.  Thus,  the  appropriate  measure  of  accom¬ 
modation  is  the  number  of  persons  in  a  group  who  succeed  in  escap¬ 
ing  before  the  time  deadline. 
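The logic of the bottleneck and of the escape measure can be sketched in a few lines. This is an illustration only, not the circuitry used in the experiments: time is counted in one-second ticks, the 30-second deadline is an assumed value, and `run_trial` and the two policies are hypothetical names.

```python
HOLD_TICKS = 3  # seconds a lone subject must hold his button to escape


def run_trial(policy, n_subjects=7, deadline=30):
    """Return the number of subjects who escape before the deadline.

    policy(i, t, waiting) -> True if subject i presses at tick t.
    The deadline of 30 ticks is an assumption for illustration."""
    escaped = set()
    held = {i: 0 for i in range(n_subjects)}
    for t in range(deadline):
        waiting = [i for i in range(n_subjects) if i not in escaped]
        pressing = [i for i in waiting if policy(i, t, waiting)]
        lone = len(pressing) == 1
        for i in waiting:
            if lone and i == pressing[0]:
                held[i] += 1          # uninterrupted escape attempt
                if held[i] == HOLD_TICKS:
                    escaped.add(i)
            else:
                held[i] = 0           # not pressing, or caught in a jam
    return len(escaped)


def queue_policy(i, t, waiting):
    """Orderly queue: only the lowest-numbered waiting subject presses."""
    return i == min(waiting)


def rush_policy(i, t, waiting):
    """Everyone presses all the time, producing a continuous jam."""
    return True


print(run_trial(queue_policy), run_trial(rush_policy))
```

An orderly queue lets all seven subjects out in 21 seconds, while simultaneous rushing lets no one out, so the count of escapes separates the two patterns cleanly.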

On  the  basis  of  his  research  with  very  small  monetary  rewards 
and fines, Mintz concluded that danger or threat is not a necessary
factor  in  the  development  of  jamming  and  incoordination.  Our 
results  indicate  that  this  conclusion  is  quite  misleading.  Varying 
the  magnitude  of  the  danger  over  a  fairly  sizeable  range  of  negative 
values,  we  found  that  with  increasing  danger  the  amount  of  inco¬ 
ordination  increased  markedly  (Kelley,  Condry,  Dahlke  and  Hill, 
1965).  The  results  are  shown  by  the  data  summarized  in  Figure  10, 
where  percentage  of  the  group  escaping  is  plotted  against  average 
amount  of  concern  expressed  in  the  group.  This  is  shown  for  groups 
composed  of  men  and  women,  and  under  three  experimentally  in¬ 
duced  degrees  of  threat  (low,  medium,  and  high).  It  is  clear  that 
with  increasing  threat  and  increasing  concern,  the  amount  of  jam¬ 
ming  increases:  fewer  persons  succeed  in  escaping  before  the  time 
limit is reached. It is also clear, by the way, that women are not
exactly  the  preferred  accommodation  partners  under  these  circum¬ 
stances.  They  consistently  have  the  lowest  degree  of  success  in 
escaping. 

Figure 10. Results from the Interdependent Escape Experiment.
(Adapted from Kelley, Condry, Dahlke and Hill, 1965.)

Mintz' study illustrates that wrong conclusions are easily drawn
through the use of insufficiently important incentives. His work also
suggests that superficially similar social processes may result from
markedly different motivational conditions. Subjects do create jams
without there being any real danger, but this probably reflects a
competition of an aggressive nature (such as is sometimes observed
when people queue up in a ticket line or at the exit of a parking lot
after the ball game) rather than the fear-based panic observed under
extreme danger.

Our research on interdependent approach also shows the effects
of varying incentives in this type of interdependence relationship.
This work has used the Deutsch and Krauss trucking game (1962).
As shown in Figure 11, the two subjects in this game find it necessary
to share a single path if they are to reach their respective goals in
the most economical manner. The two subjects operate two trucking
companies, one called Acme and the other, Bolt. In making trips
from their respective starting points to their destinations, the two
companies have a common, one-way route and each also has a much
longer alternate route. Because their costs are a direct function of the
time they spend going from the start to the destination, it is in each
company’s interest to use the short, one-way road and to be the first
to do so. But, of course, they cannot both proceed by means of the
one-way road at the same time. If they do, they block each other’s
progress and waste valuable time. It is also highly wasteful of time
for a company to take its long alternate route. Thus, for maximal
profit, one or the other must wait each time while the other uses the
short, one-lane road. The structure of their interdependence is again
one of mutual behavior control (as in the case of the interdependent
escape problem), and accommodation requires coordinated
turn-taking. Deutsch and Krauss found that this solution was achieved
readily and dependably under ordinary circumstances. However, it
did not occur when the subjects were both given additional means
of interfering with each other. These means consisted of the gates
(shown in Figure 11) by which a person could block the other person’s
way to his goal. The detrimental effects of the gates were interpreted
by Deutsch and Krauss as indicating that the availability of threat
actions operates to interfere with mutual accommodation.

Figure 11. Map for the Deutsch and Krauss Trucking Game.

This result was obtained in an experiment where the incentives
were quite insignificant, consisting of small amounts of imaginary
money. Gallo (1966) subsequently showed that these detrimental
effects of the gates were greatly attenuated when the rewards
obtained upon reaching the goals were greatly increased in value.
Figure 12 shows the results. It can be seen that the Deutsch and
Krauss results are replicated under the low incentive (imaginary
money) condition, the pairs with gates doing consistently more
poorly than pairs without gates. However, when real money is
involved the gates have only a temporary effect and the two
conditions are hardly discriminable in payoffs.

Figure 12. Gallo’s Results with the Deutsch and Krauss Trucking Game.

Thus, as in the case of Mintz' work, it appears that conclusions
about accommodation drawn from experiments with trivial
incentives may be quite misleading. The Deutsch and Krauss conclusion
about the detrimental effects of the gates is probably true only when
the payoffs are relatively insignificant. It goes without saying that
this is hardly the situation about which we are interested in
generalizing. As Deutsch has correctly noted in response to this point,
Gallo’s work shows that if the positive incentives are high enough,
persons will suppress or inhibit whatever disruptive motivation or
impulses are aroused by the interfering actions. This raises the
whole question, in attempting to generalize such findings, as to the
balance between positive and disruptive motivation. This is a
question to be answered empirically for any real-life situation. It is not
to be taken for granted in advance that the particular balance in
the original Deutsch and Krauss experiment, or, for that matter, the
different balance in Gallo’s experiment, affords the appropriate
basis for generalization to any particular situation.

One might be tempted to conclude from our several experiments
that high negative incentives (such as shock in the escape situation)
interfere with accommodation and high positive incentives (money
in Gallo’s work) facilitate it. However, an experiment by Daniels
(1967) suggests that this conclusion is not justified. He found that
high positive incentives may have either facilitative or disruptive
effects depending upon other factors in the situation. As his
interdependence problem, Daniels used a simple exchange procedure
which amounts to a generalized and decomposed form of the
Prisoner’s Dilemma. Varying the procedural rules and communication
conditions, Daniels found that when high value was attached to the
items being exchanged, the two persons generally managed to
achieve a more profitable mutual accommodation than when the
items had little significance. However, high positive incentives
tended to have the opposite effect when the procedural rules under
which the exchange took place made it possible for a person who
made an attempt to improve the exchange to be double-crossed and
exploited by his partner.

The point is made more clearly by a subsequent (as yet
unpublished) experiment by Daniels, Meeker, and Shure, conducted at
System Development Corporation in Santa Monica. When exchange
is terminated by bilateral agreement, as in classical bargaining
procedure, the exchange develops more profitably for the participants
under high incentives than under low incentives. In contrast, when
each exchange can be decided unilaterally, through an action of
either party alone, the participants gain smaller profits under high
incentive conditions than under low. This suggests the principle
that heightened incentives improve the quality of the attained
accommodation only when the participants have close and firm control
over the course of the accommodative process.

Our studies of the accommodation of conflicting interests under
conditions of full information and limited communication have also
examined tacit communication processes. The experiments with
the situation invented by Deutsch and Krauss have dealt mainly
with the consequences that blocks, gates and other “threat” actions
have for the development of the appropriate coordination routine.
Our research shows that these actions and others that constitute
“threats” in the ordinary sense of the term (communications of
intent to harm the other person) do not necessarily have disruptive
effects upon the accommodation process (Kelley, 1965). In fact, the
effect of such communication possibilities is sometimes to make
possible more rapid accommodation.

Deutsch and Krauss had reasoned that a player could close the
gate to threaten the other person with continued blocking or
aggressive action unless he submitted to the player’s wishes. However, a
study by Gerald Shure and his colleagues at System Development
Corporation (Shure, Meeker, Moore and Kelley, manuscript) shows
that a blocking action can communicate other intentions. They
studied an interdependence problem similar to Deutsch and Krauss’
trucking game, one that presents two players with a limited means
of reaching their goals and requires as an accommodative solution
a simple turn-taking procedure. In the absence of other means of
communication, a blocking action (by which one person could
prevent the other from using the limited facility) was often interpreted
as a signal given to aid the coordination of their actions. This
interpretation was a common one for Shure’s subjects and when the
block was interpreted in this manner, subjects tended to develop
the appropriate turn-taking accommodation to their situation. As
a consequence, pairs who had the block available achieved better
outcomes than those without the blocks. In brief, under certain
conditions, an action that might seem to be aggressive can be interpreted
as reflecting a cooperative intention. The results of Shure et al.
are particularly interesting because at the beginning of the
interaction the subjects tended to view the block in negative terms as a
hostile, selfish, competitive move, much as Deutsch and Krauss had
assumed. However, in many of the dyads, the block action lost
whatever significance it had initially as a threatening and disruptive
move, and became used regularly as a means of signaling whose turn
it was to use the limited facility.

The present evidence is not entirely clear regarding the conditions
under which the tacit communication significance of an action can
change in this manner. However, there are several suggestive leads
in Shure’s research and in research reported by Shomer, Davis and
Kelley (1966). Shure’s results are obtained under conditions that
differ from Deutsch and Krauss in one important respect; namely,
the  subjects  had  no  alternative  ways  of  reaching  their  goals.  If  they 
blocked  each  other’s  use  of  the  limited  one-way  facility,  they  could 
not  take  independent  routes  to  their  goals.  In  other  words,  they 
were  forced  to  maintain  an  interaction  with  each  other.  The  impli¬ 
cation  is  that  when  kept  in  the  interdependent  relationship  and 
required  to  persist  in  their  efforts  to  achieve  an  accommodation,  the 
block  move  becomes  useful  as  a  means  of  doing  so.  Consistent  with 
this  view  are  Shure’s  additional  results  from  a  further  experiment. 
When  an  alternative,  independent  facility  was  provided  for  each 
person,  the  blocks  interfered  with  the  accommodation  just  as  in 
the original Deutsch and Krauss experiment. A study by Davis
makes  the  same  point,  that  the  interfering  effects  of  the  gates  depend 
upon  there  being  available  alternative  routes  which  permit  inde¬ 
pendent  (though  costly)  attainment  of  individual  goals.  Gallo’s 
results,  noted  earlier,  may  be  given  a  similar  interpretation.  When 
real  money  is  involved,  the  subjects  are  constrained  to  stay  within 
the  relationship  and  not  to  take  their  independent  alternative 
routes.  Under  these  conditions  the  gates  or  blocks  produce  little 
difficulty. Finally, an experiment by Shomer provides further
evidence consistent with these various points: (1) that a threat action
is  not  detrimental  to  establishment  of  coordination  if  interaction  is 
maintained  and  (2)  that  under  these  circumstances  the  threat  action 
often  becomes  used  as  a  coordinating  signal  and  facilitates  the  suc¬ 
cess  of  the  interaction. 

In sum, the indicated generalization seems to be that when other
means of communication are not available and when interdependent
persons are constrained to the boundaries of their relationship (and
are  not  able  to  go  their  separate  ways),  moves  or  actions  that,  under 
other  circumstances,  would  have  negative  significance  may  be  trans¬ 
formed  into  facilitative  communication  devices. 

A further interesting problem in tacit communication arose in
connection  with  our  work  on  the  minimal  social  situation  which  I 
have  described  earlier.  Although  this  relationship  does  not  involve 
a  conflict  of  interest,  our  study  suggests  a  communication  problem 
that  is  quite  important  in  conflict  settings.  In  our  prior  references 
to the situation in Figure 4, we were considering the case in which
each  person  had  no  knowledge  of  the  other  in  any  way.  In  the  study 
to  which  reference  is  now  made  (also  reported  in  Kelley,  Thibaut, 
Radloff  and  Mundy,  1962)  each  person  was  given  a  description  of 
the  relationship  which  was  complete  except  for  one  fact,  namely, 
which  of  his  responses  had  which  effect  on  the  other  person.  Thus, 
the  two  persons  were  faced  with  the  problem  of  discovering  how 
they  affected  one  another  and  once  they  discovered  that,  they 
could  easily  accommodate  to  their  mutual  interests.  With  a  mo¬ 
ment's  thought,  the  reader  will  see  how  they  went  about  discovering 
their  effects  upon  each  other.  They  assumed  exactly  what  we  have 
assumed in our analysis of this situation; namely, persons follow a
“win-stay, lose-change” tendency. Thus, if a person likes what is
happening to him, he will continue with his ongoing behavior; if
not, he’ll change. To apply this concept for informational purposes,
our subject merely pressed one button repeatedly and observed what
happened. If the other person’s response was stable or consistent
(whether he gave a point or shock), our subject could infer that he
was making the response that rewarded the other one. If, on the
other hand, the other’s response was variable, returning now a
shock and now a point, our subject could infer that his response was
shocking the other one. Once he made this inference, our subject
could then use his rewards and punishments to induce the other
one to make the proper rewarding response.

However (and here is where the example illustrates an important
point), this method of discovering from the other person's moves
how one's own actions affect him cannot work if both persons are
using it at the same time. For example, if both persist in making
one response in order to find out what the other one does, each one
will conclude he has discovered the good response when in fact he
is merely observing the other's similar attempt to gain information.
The point is that in order that accurate information be conveyed by
this tacit means, it is necessary for one of the persons to have an
information-gathering intention to a greater degree than the other
one.
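
The inference rule just described, and the way it breaks down when both persons probe at once, can be sketched in a few lines of Python. This is only an illustrative sketch: the two-button encoding, the function names, and the number of trials are assumptions introduced here and are not details of the original experiment.

```python
from typing import List

POINT, SHOCK = "point", "shock"


def outcome(button: int, rewarding_button: int) -> str:
    """What the partner receives when `button` is pressed."""
    return POINT if button == rewarding_button else SHOCK


def observe(trials: int, a_rewards_b: bool, b_also_probes: bool) -> List[str]:
    """Return what A receives on each trial while A presses one button repeatedly.

    B either probes in the same way (keeps pressing one button) or follows
    win-stay, lose-change: repeat after a point, switch after a shock.
    """
    a_button, b_button = 0, 0
    a_rewarding = 0 if a_rewards_b else 1  # which of A's buttons rewards B
    b_rewarding = 0                        # which of B's buttons rewards A (arbitrary)
    received = []
    for _ in range(trials):
        # Both respond on the same trial; each then sees his own outcome.
        received.append(outcome(b_button, b_rewarding))
        b_gets = outcome(a_button, a_rewarding)
        if not b_also_probes and b_gets == SHOCK:
            b_button = 1 - b_button        # lose-change
    return received


def infer(received: List[str]) -> str:
    """A's rule: stable feedback means 'I reward him'; variable means 'I shock him'."""
    return "rewards" if len(set(received)) == 1 else "shocks"


if __name__ == "__main__":
    print(infer(observe(6, a_rewards_b=True, b_also_probes=False)))
    print(infer(observe(6, a_rewards_b=False, b_also_probes=False)))
    # Both probing: A sees stable feedback and wrongly concludes he rewards B.
    print(infer(observe(6, a_rewards_b=False, b_also_probes=True)))
```

When only A probes, his inference matches the true state in both cases. When B probes as well, A's feedback is stable whatever his button actually does, so the rule returns "rewards" even when he is in fact shocking the other person.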

The more general principle here is that a person's ability to gain
information from the other person's moves depends upon his own
intentions. Certain pairs of actors' intentions are incompatible in
the sense that they jointly produce patterns of behavior that are
misleading from the point of view of tacit communication. We are
currently investigating within the context of the Prisoner's Dilemma
relationship the problem of errors in the inference of intention that
occur as a function of what the two intentions are.

In devoting as much consideration as I have to tacit communication,
I do not wish to leave the impression that this sort of communication
by actions or moves is highly effective. It is certainly
one of the more common findings in this area of research that
accommodation is faster and better the more means the interdependent
parties have for explicit communication. This is shown, for example,
in a number of Morton Deutsch's experiments (1962) though
not, surprisingly enough, with the trucking game (Deutsch and
Krauss, 1962). The importance of explicit communication in the
interdependent escape situation has been investigated by Arthur
Hill (reported in Kelley, Condry, Dahlke, Hill, 1965). Hill's study
indicates the value of enabling the less fearful members of a group
to express their lack of concern about the danger situation. Specifically,
Hill gave the subjects a special response to make if they felt
confident and were willing to wait, and this response was incompatible
with the escape response itself. The amount of uncoordinated
"jamming" in the escape route was markedly lower when the less
concerned individuals were able to make their presence known, as
compared with a situation in which their behavior was not distinctively
or dramatically different from that of worried but not yet
panicked persons.

Hill's finding raises an important question about tacit communication
in situations such as that of the interdependent escape problem,
where imitative behavior is detrimental to successful accommodation.
For example, many danger situations provide a dramatic
and obvious way for the most frightened persons to manifest their
reactions to the situation. This is the escape response. On the other
hand, there is usually no equally dramatic or obvious way for the
less frightened persons to exhibit their reactions. The calm persons
are often uncertain as to which of several actions to take and rarely
do they take the sort of concerted action that the highly fearful
persons do. As a consequence, the latter tend to carry the day and
to induce the more undecided or neutral persons to follow their
example. Hill's research shows the great value that can come from
counteracting this situational bias by having available for the more
confident persons a single, obvious, dramatic mode of making themselves
known, thereby enabling them to present an example to
offset that provided by the concerted escape efforts of the most
frightened persons.

The studies just described deal with situations of complete information
and limited communication. Consider now the third and
last category of research: that on conflict of interest relationships
under conditions of limited information and full communication.
The accommodation processes observed under these conditions are
usually referred to as bargaining or negotiation processes. The
interdependence problems are similar again in general form to
Example II (Figure 2) and pose the type of problem for the participants
that this relationship would pose if the two persons were not
permitted to take turns but had to agree upon a single cell in the
matrix for the duration of their relationship. However, our bargaining
problems are considerably more complex than this simple 2 × 2
example, involving more responses and more gradation of outcomes
and, therefore, enabling a solution by way of concession and compromise.
As noted before, the interdependence problem is "mixed
motive" because it is in the two persons' mutual interest to avoid
certain states (e.g., the "left-left" and "right-right" combinations in
Figure 2), but as between the remaining combinations, their preferences
are in conflict.

Most importantly, the accommodation occurs under conditions of
incomplete information. We have argued that the principal components
of the accommodation process observed in bargaining can
be traced to this state of incomplete information. This term refers
to the fact that although bargainers know the general mixed-motive
character of their relationship, they do not know directly and at
first hand certain important facts about each other's positions and
values. Each person knows how much he himself values various
possible agreements and how poor an agreement he can afford to
accept without finding it more desirable to turn to other alternative
relationships (other customers, employers) available to him. However,
he does not know these facts as they pertain to the other person.
He can estimate them, inquire about them, and be told about them
by his adversary, but in the final analysis all this information is
indirect and open to question.

In our bargaining studies, as in natural negotiations, the parties
are permitted to communicate with one another whenever and
whatever they wish. However, the conditions just described (the
mixed-motive relationship and incomplete information) mean that
the  participants  themselves  will  place  various  constraints  upon  the 
communication  process.  These  constraints  derive  from  the  several 
dilemmas  they  face  with  respect  to  communication.  The  common 
interest  component  of  their  relationship  makes  it  desirable  for 
them  honestly  to  exchange  all  the  facts  about  their  relationship. 
Only  by  doing  so  can  they  achieve  an  accommodation  at  minimal 
cost,  eliminate  the  risk  of  failing  to  agree  when  it  is  in  their  mutual 
interest to do so, and avoid agreements that are less satisfactory than
others available to them. On the other hand, the conflict of interest
provides an incentive for each party to conceal or distort information
about his own values and position. Thus, it is to each person's
advantage to exaggerate how well off he would be if they fail to
agree on one of the mutually positive cells and to play down how
much  he  stands  to  gain  if  they  do  so.  And  as  each  party  feels  the 
temptation  to  be  dishonest  himself,  he  becomes  distrustful  of  com¬ 
munications  from  the  other  person. 

Another way to put the problem is in terms of dilemmas the bargainers
face, and to ask how these dilemmas are resolved. With a
mixed-motive relationship and incomplete information, each bargainer
is faced with dilemmas concerning such matters as honesty
vs. deceit, trust vs. distrust, and openness vs. secrecy. These dilemmas
are paralleled by and intimately related to other dilemmas concerning
the proper orientation toward the other party (cooperative vs.
competitive) and the setting of realistic goals for oneself (high vs.
low).

The processes by which these dilemmas are resolved have been
investigated through analysis of the bargaining behavior of subjects
who are dealing with important matters and are relatively free from
time pressure in reaching their agreement. This study (Kelley, 1966)
was conducted in a classroom setting where students bargained for
scores that constituted half of their course grades. They had opportunities
thoroughly to master the cognitive aspects of the problem
and to rehearse and practice their procedures. And they were
given a lengthy time period to reach agreement on a relatively simple
contract.

The bargaining task required two persons to negotiate an agreement
covering five different issues. An example of the payoff table
given one party and from which he bargained is shown in Figure
13. The five different issues (items) are listed across the top of the
table and the 20 ways of settling each issue are listed down the side
of the table. The table indicates for our player the value to him of
an agreement at each possible point. The two bargainers' interests
are in direct opposition on each issue because the payoff table for
the other bargainer runs in the opposite direction on each item.
However, it can be seen that not all issues are equally important to
our bargainer. For example, while it is terribly important to our
bargainer where issue #2 is settled, he is relatively indifferent about
where issue #:> is settled. The relative importance of the five items
differs between the two bargainers and it is this feature that converts
the relationship from a zero-sum or constant-sum game to the
type of mixed-motive game we are considering here. The bargaining
task was considered completed only if the two bargainers had
reached agreement on all five items before a time deadline. During
this period they proposed different contracts or sets of agreement
points to one another and reacted with counterproposals, concessions,
and so on. In addition, each bargainer was assigned a total
score he would receive if they failed to reach agreement before the
time deadline or if, as was permitted, either party unilaterally broke
off negotiations. For the payoff table in Figure 13, the score might be
145. These scores provided each person with a minimum guaranteed
level which he was certain to get regardless of the other person's
actions. A final essential feature of the situation was that information
defining a given party's payoffs and his guaranteed level was
given only to that person, and the bargainers were not permitted to
exchange these official payoff sheets. During the negotiation, they
were free to say anything they wished about their values and to use
any sort of appeal they wished. However, they had no completely
convincing way of communicating their respective values and interests,
and thus were faced with the problems of persuasion and
trust.
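
The claim that differing priorities convert an opposed-interest table into a mixed-motive game can be checked with a small numerical sketch. The importance weights below are hypothetical and are not the values of the payoff table in the figure; they assume only what the text states: five issues, 20 settlement points each, tables running in opposite directions, and different relative importance for the two bargainers.

```python
from typing import Sequence

LEVELS = 20  # ways of settling each issue, as in the text

# Hypothetical importance weights for the five issues (illustrative only).
WEIGHTS_OURS = [1, 6, 3, 2, 1]   # issue #2 matters most to our bargainer
WEIGHTS_OTHER = [6, 1, 2, 3, 1]  # the other bargainer has different priorities


def value_ours(contract: Sequence[int],
               weights: Sequence[int] = WEIGHTS_OURS) -> int:
    """Our bargainer gains more the higher each issue is settled (1..LEVELS)."""
    return sum(w * (k - 1) for w, k in zip(weights, contract))


def value_other(contract: Sequence[int],
                weights: Sequence[int] = WEIGHTS_OTHER) -> int:
    """The other bargainer's table runs in the opposite direction."""
    return sum(w * (LEVELS - k) for w, k in zip(weights, contract))


middle = [10, 10, 10, 10, 10]  # split every issue near the midpoint
trade = [1, 20, 20, 1, 10]     # each side yields on the issues it cares less about

if __name__ == "__main__":
    for name, contract in (("middle", middle), ("trade", trade)):
        print(name, value_ours(contract), value_other(contract))
    # With identical weights the joint total is the same for every contract
    # (a constant-sum game).
    print(value_ours(middle) + value_other(middle, WEIGHTS_OURS),
          value_ours(trade) + value_other(trade, WEIGHTS_OURS))
```

Under these assumed weights, both parties do better with the "trade" contract than with the midpoint split, which is possible only because their priorities differ. When the two weight lists are identical, every contract yields the same joint total and one party's gain is exactly the other's loss.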

This bargaining task was given to advanced undergraduate students.
It was made important to them by making part of a course
grade depend upon their success in a series of such negotiations.
They had ample opportunity to study and think about the situation,
and at each session, they had a 75-minute period merely to
reach agreement on these five items.

The  results  obtained  under  these  rather  ideal  conditions  indicate 
that the bargainer's behavior regularly exhibits certain features
which can readily be interpreted as having functional value in
enabling the bargainers to resolve the dilemmas inherent in their
relationship. That is, the negotiating behavior we observe seems to
provide a means of deciding on an acceptable goal level, a means of
gaining  convincing  information  from  the  other  party  without  tak¬ 
ing  his  statements  at  face  value,  a  means  of  convincing  him  of  your 
own  situation  without  being  totally  honest  and  open,  and  a  means 
of  avoiding  agreement  on  contracts  where  there  exist  better  ones  for 
both  parties. 

These behaviors seem to fall into three categories: (1) avoiding
early commitment, (2) inducing the other to make concessions, and
(3) making economical concessions.

The first category, avoiding early commitment, is reflected in four
behaviors: (a) starting high, (b) making exploratory rather than firm
offers, (c) settling the entire contract as a package rather than on an
item by item basis, and (d) using all the time available (bargaining
right up to the last minute). These actions have in common the
effect of keeping things open and flexible for as long as possible in
order to have maximal opportunity to gain information. In view of
his uncertainty about the situation and his need for information in
order to set optimal goals, a bargainer should not become committed
to any aspect of the contract until he has had all possible
opportunity to obtain such information. Because the other person
can be expected to withhold or distort the relevant information, our
negotiator can anticipate gaining it only through the course of the
interchange, from the patterning and timing of the other's actions
rather than from his explicit statements. Thus, the resolution of
both the dilemma of goals and the dilemma of trust requires the
avoidance of early commitment. The flexibility and tentativeness
implied by these commitment-avoiding actions are also useful in resolving
the dilemma of honesty versus dishonesty. As long as one's
own proposals are not made in rigid or final form, they may contain
some exaggeration and distortion without running the risk of
disconfirming them through one's subsequent actions and, thereby,
of reducing the other person's willingness finally to believe you.

The second general category of behavior regularly observed in
our experiment consisted of inducing the other party to make concessions.
This involved the application of pressure through such
specific tactics as (a) making occasional negative concessions or retreats
in one's own offers, (b) using time pressure, and (c) persistent
registering of complaints that the other's offers are inadequate. The
concessions the opponent makes seem to be essential to our bargainer's
resolution of the dilemmas of trust and goals. Each concession
heightens his feeling that his emerging share of the contract
is an appropriate one. And the increasing grudgingness with which
concessions are granted undoubtedly makes increasingly credible the
other party's communications about his unwillingness to go further.

Of course, making concessions is a two-way affair. A basic way of
eliciting a concession is to make one. Thus, throughout our data
from this study and from similar bargaining studies there is overwhelming
evidence of a turn-taking or alternation pattern between
the two bargainers in their making of concessions. But in making
their concessions, our subjects regularly strived to behave according
to our third category, that is, to make economical concessions. An
economical concession means one that gives the other party as
much as possible but gives up as little as possible for one's self.
Economical concessions are probably important in resolving the
dilemma of cooperation. In the attempt to make the other person
satisfied while sacrificing as little as possible of one's own values,
account is taken of both parties' interests or, at least, of their feelings.
This amounts to adopting a quasi-cooperative orientation (with
what is often a cooperative effect, as we will see in a moment) without
resorting to open and explicit cooperation.

Making economical concessions appears in such specific tactics as
(a) seeking information about the opponent's values and priorities
(necessary if the optimal concessions are to be made on an informed
basis) and in (b) trying out various contracts at a given value level
for one's self (which, as we shall see in a moment, makes possible
optimal concessions on a trial and error basis). Also, because
economical concessions, if perceived as such, are not likely to be
highly regarded by the opponent, our bargainers consistently (c)
exaggerated the cost to themselves of concessions they made.

The attempt to find ways to make concessions that are most valuable
to the other party but are least costly to one's self has valuable
consequences for the joint welfare. This attempt generates a tactic
that we had earlier identified in Schenitzki's analysis of the Siegel
and Fouraker bargaining situation: that of testing several different
contracts at a given level, to see whether any are acceptable to the
other person, before making further concessions. This procedure
serves to reduce the likelihood that unnecessarily poor contracts
will be selected, although the bargainers are not particularly aware
of this consequence of their tactics.

This process can be illustrated by reference to Figure 14 which is
based on Schenitzki's work and adapted from Kelley (1964). The
profits of one player (who was called the "seller" in Schenitzki's
study) are shown along the ordinate, and the profits of the other
player (the "buyer") are shown along the abscissa. Plotted on the
figure are the various contracts the pair might agree upon which
are profitable for both (that is, above their respective break-off or
independent values). The seller seeks to attain agreement on a contract
at the upper left of the scatter plot and the buyer seeks to
attain agreement at the lower right. The question that concerns us
here is whether, in the pursuit of these interests, they will manage
to agree on a contract yielding them maximum total profit (one on
the outer line of contracts, usually referred to as the "Pareto-optimal"
outcomes) or one that is less than optimal for the pair (one
of the "inner" contracts). The answer would obviously be the
former if both parties knew both sets of values, but remember that
we are considering the case of partial information where each
player knows only his own profits for the various contracts. Schenitzki's
work shows that despite this limitation of information, if the
negotiators start high and make their concessions in a systematic
manner, they will agree on a contract characterized by Pareto optimality.
This can be seen in Figure 14 as follows. Knowing only the
profit each contract yields him, each bargainer has little choice but
to begin by proposing contracts for which his own profits are high.
As he finds these to be unacceptable, he proposes additional contracts
yielding him somewhat lower profits and so on. As he makes
successive concessions, each time dropping his level of aspiration, he
enlarges the set of contracts he considers acceptable. Thus, as in
Figure 14, it can be seen that as time progresses, the set of contracts
the buyer considers acceptable becomes larger, and the same is true
for the seller. The question of predicting their final agreement resolves
into the question of where these two sets of acceptable contracts
will first overlap. As shown in Figure 14, the two sets are likely
to intersect first on the contracts located along the outer edge of
the total set. Thus, if concessions are made systematically, with
fairly good testing of the various possible contracts at each profit
level before moving on to less profitable ones, the common interest
will be served to maximal degree. It should be noted that this
process of systematic concessions identified by Schenitzki is possible
in the bargaining situation we have been considering only if the
negotiation is conducted in terms of packages or sets of items rather
than one item at a time. If the subjects formally settle one or two
of the various issues before dealing with other issues, they lose the
flexibility that is necessary for systematic exploration of each profit
level. They thereby reduce the likelihood that they will agree upon
the optimal contracts.
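
The systematic-concession process can be sketched as a short simulation. The contract profits below are invented for illustration and are not Schenitzki's data. The sketch also assumes that both bargainers lower their aspiration by one unit per round and test every contract at each level. Under those assumptions the first overlap of the two acceptable sets falls on the outer edge; the text claims only that this is likely, and with other numbers or coarser concession steps a dominated contract could enter the first overlap.

```python
from typing import List, Tuple

Contract = Tuple[int, int]  # (seller_profit, buyer_profit), hypothetical units

CONTRACTS: List[Contract] = [
    (9, 1), (8, 3), (7, 5), (5, 7), (3, 8), (1, 9),  # outer edge
    (4, 4), (5, 3), (3, 5), (2, 2), (6, 2), (2, 6),  # inner contracts
]


def pareto_optimal(contracts: List[Contract]) -> List[Contract]:
    """Contracts for which no other contract is at least as good for both."""
    def dominated(c: Contract) -> bool:
        return any(o != c and o[0] >= c[0] and o[1] >= c[1] for o in contracts)
    return [c for c in contracts if not dominated(c)]


def first_overlap(contracts: List[Contract], start_level: int) -> List[Contract]:
    """Lower both aspiration levels step by step until some contract
    is acceptable to seller and buyer at once; return those contracts."""
    level = start_level
    while level >= 0:
        seller_ok = {c for c in contracts if c[0] >= level}
        buyer_ok = {c for c in contracts if c[1] >= level}
        both = seller_ok & buyer_ok
        if both:
            return sorted(both)
        level -= 1  # each party makes a one-unit concession
    return []


if __name__ == "__main__":
    agreed = first_overlap(CONTRACTS, start_level=9)
    print(agreed)
    print(all(c in pareto_optimal(CONTRACTS) for c in agreed))
```

Each bargainer uses only his own profit figures, yet the contracts on which the two acceptable sets first meet are ones that neither could improve upon without cost to the other. Settling issues one at a time would amount to removing contracts from the list before this search is complete.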

The preceding observations refer to highly motivated and skillful
bargainers, working under relatively little time pressure. In the absence
of these rather ideal conditions, we might expect that the restraints
against exchange of information would not be adequately
circumvented because there would not be the kinds of extended
testing, probing, and exploring that have just been described. If
bargainers  are  unable  to  exchange  information  about  their  rela¬ 
tionships  (because  their  less  than  maximal  efforts  to  do  so  are 
thwarted  by  their  tendencies  to  withhold  information),  how  do  they 
solve  their  accommodation  problem? 

A  recently  completed  series  of  studies  (Kelley,  Beckman  and  Fis¬ 
cher,  1967)  seems  to  shed  some  light  on  this  question.  Two  parties 
negotiated  the  division  of  a  reward,  again  under  conditions  of  in¬ 
complete  information.  For  each  one,  there  was  specified  a  minimum 
share  of  the  reward  which  it  was  necessary  for  him  to  obtain  if  he 
was to make any profit on the agreement. Each knew his own minimum
necessary share, but not that of his adversary. Each pair bargained
on repeated problems with the two minimum necessary
shares  varying  unpredictably  from  one  occasion  to  the  next.  The 
stakes  were  not  such  as  to  warrant  extended  bargaining,  and  in  any 
event,  the  time  pressure  was  high  enough  to  preclude  extended 
bargaining. 

These  studies  yield  highly  regular  patterns  of  results  in  terms  of 
such  things  as  opening  bids,  order  of  successive  bids,  rates  of  con¬ 
cessions,  time  to  reach  agreement,  points  of  agreement,  and  failures 
of  agreement.  What  is  particularly  striking  about  the  results  is  the 
degree to which they are reproducible by a simple mathematical
model. The details would not be appropriate here, but the properties
of this model suggest that, in the absence of information about
the situation of his opponent, the bargainer's behavior is based upon
general expectations he forms about the relationship. These expectations
are geared to the situation in general and to the typical
adversary. As such, these expectations afford generally reasonable
guidelines for making accommodative concessions while still protecting
one's interests. Their only limitation is that, being adjusted
to  the  average  or  typical  situation,  they  are  not  appropriate  for 
atypical  cases.  The  consequence  is  that  appropriate  adjustments  are 
not  made  for  these  unusual  cases.  To  make  such  adjustments  would 
require  information  exchange,  and  little  such  exchange  occurs  under 
the  conditions  that  prevailed  in  this  experiment.  The  consequent 
cost  to  the  pair  is  a  high  probability  of  failure  to  reach  accommoda¬ 
tions  in  certain  situations  where  agreement  is  in  their  mutual  interest 
(although  only  marginally  so).  In  this  respect,  these  experiments 
particularly  highlight  the  difficulties  negotiators  encounter  in 
achieving  satisfactory  accommodations  when  their  respective  ex¬ 
pectations  as  to  shares  of  the  reward  are  both  high.  With  open  and 
dependable  communication,  the  expectations  would  be  adjusted  to 
the  two  sets  of  needs  (the  minimum  necessary  shares)  that  the  bar¬ 
gainers  bring  to  their  relationship,  and  the  difficulties  arising  from 
too high demands could be avoided. In the absence of open and
dependable  communication,  the  bargainers  can  evolve  methods  that 
enable  them  generally  to  make  accommodations,  but  these  methods 
yield  failures  of  accommodation  when  their  circumstances  depart 
too  sharply  from  the  average  or  typical. 

SUMMARY 

This  wide  ranging  set  of  experiments  merely  samples  the  total 
domain  of  phenomena  that  eventually  must  be  explored  in  order 
to  fulfill  the  goals  stated  at  the  outset,  namely,  (1)  to  identify  the 
types  of  interdependence  problems  and  their  accommodative  solu¬ 
tions,  and  (2)  to  analyze  the  processes  by  which  accommodations 
are  made. 

The  four  or  five  main  categories  of  factors  to  be  studied  and 
analyzed  in  their  interplay  have  become  fairly  clear  from  the  work 
so  far.  These  are  (1)  pattern  of  interdependence,  (2)  information 
held  by  the  participants,  (3)  their  communication  capabilities,  (4) 
the incentives and motivation, and (5) what have been called the
process constraints: the factors governing the evocation of responses
as to their order and timing.

The research described here is part of the new trend of experimental
work oriented toward developing a basic science relevant to
interpersonal  and  intergroup  relations.  The  special  contribution  of 
social  psychologists  to  this  field  has  been  the  development  and  ap¬ 
plication  of  appropriate  experimental  methods.  However,  it  should 
be  emphasized  that  the  experimental  work  does  not  proceed  in  a 
vacuum.  It  is  conducted  with  full  cognizance  of  the  relevant  in¬ 
vestigations  conducted  by  other  scientists,  using  other  methods,  in 
such  fields  as  marriage  and  the  family,  labor  and  management,  inter¬ 
group,  and  international  relations.  The  present  paper  does  not 
permit  even  a  brief  summary  of  the  broad  empirical  background 
within  which  the  experimental  work  proceeds.  The  general  strategy 
of  the  latter  approach  is  to  try  to  capture  the  essential  properties  of 
important types of interpersonal relationships and, through exploitation
of the special opportunities for control and analysis which
the  experimental  method  provides,  to  identify  in  detail  the  various 
types  of  accommodative  processes  and  the  factors  that  determine 
their  success  or  failure.  Given  this  detailed  analysis,  observers  of 
natural  and  more  complex  matters  will  be  able  more  easily  to 
dissect  and  understand  their  phenomena  and  to  formulate  policy 
recommendations  regarding  means  of  minimizing  fruitless  conflict 
and increasing the quality of achieved accommodations.

IV. An Approach to
Bioengineering

Y.  C.  B.  Fung 

INTRODUCTION  AND  HISTORICAL  BACKGROUND 

What  is  bioengineering?  Obviously,  bioengineering  is  the  inter¬ 
section  of  biology  and  engineering.  Biology  is  the  science  of  life. 
But  what  is  engineering?  It  is  often  debated  among  engineers.  Re¬ 
cently,  several  societies  held  long  discussions  on  the  goals  of  engi¬ 
neering,  and  a  definition  that  evolved  is  that  “Engineering  is  the 
search  for  and  use  of  scientific  knowledge  to  the  benefit  of  man.”  It 
may  not  be  the  definition  accepted  by  everybody,  but  it  describes 
well  the  activities  of  most  engineers.  In  this  concept  engineering 
differs  from  pure  science  by  the  motivation  toward  application.  The 
difference  between  a  scientist  and  an  engineer  is  largely  psychologi¬ 
cal,  and  many  people  play  the  role  of  both.  Thus,  for  engineering, 
the  method  is  scientific,  the  mode  is  quantitative,  the  dictum  is 
economy,  the  concern  is  human. 

Bioengineering  represents  engineering  directed  at  the  living  world. 
It  encompasses  the  search  for  and  use  of  scientific  knowledge  in 
biology  for  the  benefit  of  man.  It  is  also  biology  directed  toward 
engineering.  A  bioengineer  can  therefore  be  an  engineer  concerned 
with  living  matters  or  a  biologist  motivated  by  applications. 

The field of bioengineering is immense. Obviously, it includes
the engineering application of biology, and the biological application
of engineering. Some of the popular items are listed below. In
Table I are the engineering applications of biology. Bionics, the
imitation of nature in electronic systems, is destined to be an important
industry of the future. Fermentation, wine making, leather
making, etc., are as old as our civilization. Filtration, purification of
water with biological material, etc., are other examples. In Table II
are some of the biological applications of engineering. We can
think of an unending list of subjects useful to medicine, astronautics,
and environmental health. Any one of these subjects is a
big field. But on top of all these is a unifying theme: the enlargement
of engineering principles to deal with biomedical problems.
See Table III. This is the basic research which accumulates the
broad, yet detailed, quantitative knowledge and understanding
which makes any applications possible, safe, confident, economical,
and elegant.

TABLE I. Engineering Applications of Biology

The  rest  of  this  paper  will  be  devoted  to  the  illustration  of  the 
main  theme:  the  enlargement  of  engineering  principles  for  bio¬ 
medical  applications.  I  shall  take  my  examples  from  the  problem  of 
blood  circulation. 

Figure  1  is  a  schematic  diagram  of  the  human  circulation  system. 
The  heart  is  the  prime  mover.  The  arteries  and  veins  are  the 
conduits. The capillaries are the irrigated fields where cells of the
body  obtain  their  food.  Every  cell  in  the  body  must  live  near  a 
blood capillary. On the average a cell cannot live if it is farther than
about one-thousandth of an inch away from a blood capillary. Thus
Krogh (1922) gave the classical estimate that the total length of the
capillaries in the human body is about 60,000 miles, and the total
surface area is 6,300 m². Our well-being certainly depends on the
function of the capillaries. The brain, the heart, the muscle, all
depend on the capillary blood vessels to live.

TABLE II
Biomedical Applications of Engineering

Medical Engineering
    Biomechanics
    Prosthesis
    Artificial organs
    External handling of blood, tissue, and organs
    Radiation therapy
    Rehabilitation
    Medical instrumentation
    Hospital planning

Environmental Health
    Closed systems
        Bio-astronautics
        Space exploration
    Open systems
        Ecological systems
        Sanitary engineering
        Air pollution
        Water pollution

Systems analysis for assistance to diagnosis, surgery, treatment,
and research

Aerospace medicine
Bioelectronics
Pharmacological testing
Hypo- and hyperbaric effects on living systems

TABLE III
BIOENGINEERING

Figure 1. Schematic diagram of human circulation system.

In  a  movie  of  blood  flowing  in  the  mesentery  one  can  see  the 
arteries,  arterioles,  sphincters,  capillaries,  venules,  veins  and  the 
red  blood  cells.  The  red  cells  whiz  by  in  the  capillaries  at  a  speed  of 
about  1  to  2  mm  per  second.  These  red  blood  cells  are  very  flexible. 
They  assume  all  kinds  of  shapes  in  their  course  through  the  blood 
vessels. 

Now,  there  are  many  problems  associated  with  these  flows  whose 
solutions  require  advanced  continuum  mechanics.  A  number  of  the 
greatest  masters  contributed  to  this  field.  For  example,  in  Table  IV 
I  compiled  a  list  of  names,  which  looks  like  one  lifted  from  the 
history  of  engineering  mechanics. 

William  Harvey,  of  course,  is  credited  with  the  discovery  of  blood 
circulation.  He  achieved  this  discovery  in  1616  by  many  critical  ob¬ 
servations  and  by  logical  reasoning.  Having  no  microscope,  he 
never  saw  the  capillary  blood  vessels.  This  should  make  us  appre¬ 
ciate  his  reasoning  power  even  more  deeply  today  because,  without 
the  capability  of  seeing  the  passage  from  the  arteries  to  the  veins, 
the  discovery  of  circulation  must  be  regarded  as  "theoretical".  The 
actual discovery of capillaries was made by Marcello Malpighi (1628-
1694)  in  1661,  forty-five  years  after  Harvey  made  the  capillaries  a 
logical  necessity. 

A  contemporary  of  William  Harvey  was  Galileo.  You  remember 
that  Galileo  was  a  student  of  medicine  before  he  became  famous  as 
a  physicist.  He  used  his  pulse  to  determine  the  constancy  of  the 
period  of  a  pendulum,  and  then  used  the  pendulum  to  measure  the 
pulse  rate  of  people,  expressing  the  results  quantitatively  in  terms 
of the length of a pendulum synchronous with the beat. He invented
the thermoscope and was also the first one to design a microscope in
the modern sense in 1609, although rudimentary microscopes were
first made by J. Jansen and his son Zacharias in 1590. Robert Boyle
studied the lung, and discussed the function of air in water with
respect to fish respiration. Robert Hooke gave us Hooke's law
in mechanics, and the word "cell" to biology to designate the
elementary entities of life. His famous biological book, "Micrographia"
(1664), was reprinted recently by Dover Publications. Leonhard
Euler wrote a definitive paper in 1775 on the propagation of waves
in arteries. Thomas Young, who gave us Young's modulus, was
a physician in London. He worked on the wave theory of light while
he was concerned with astigmatism in lenses and with color vision.
Poiseuille, while he was a student, invented the mercury manometer
to measure the blood pressure in the aorta of a dog and discovered
Poiseuille's law of viscous flow upon graduation.

Table IV

A List of Early Contributors to Biomechanics

Galileo Galilei (1564-1642)
William Harvey (1578-1658)
Robert Boyle (1627-1691)
Robert Hooke (1635-1703)
Leonhard Euler (1707-1783)
Thomas Young (1773-1829)
Jean Poiseuille (1799-1869)
Hermann von Helmholtz (1821-1894)
Adolf Fick (1829-1901)
Diederik Johannes Korteweg (1848-1941)
Horace Lamb (1849-1934)
Otto Frank (1865-1944)
Balthasar van der Pol (1889- )

To von Helmholtz might go the title “Father of Bioengineering”.
He was a professor of physiology and pathology at Königsberg,
professor of anatomy and physiology at Bonn, professor of physiology
at Heidelberg, and finally professor of physics in Berlin (1871). He
wrote his paper “Law of Conservation of Energy” in barracks while
he was in military service, fresh out of medical school. His
contributions ranged over optics, acoustics, thermodynamics,
electrodynamics, physiology and medicine. He discovered the focusing
mechanism of the eye, and following Young, formulated the three-color
theory of color vision. He invented the phakoscope to study the
changes in the lens, the ophthalmoscope to view the retina, the
ophthalmometer for measurement of eye dimensions, and the
stereoscope with interpupillary distance adjustments for stereo-vision. He
studied the mechanism of hearing and invented the Helmholtz
resonator. His theory of the permanence of vorticity lies at the very
foundation of modern fluid mechanics. His book “Sensations of
Tone” is popular even today. He was the first to determine the
velocity of the nerve pulse, giving the rate 30 meters per sec., and to
show that the heat released by muscular contraction is an important
source of animal heat.

The  other  names  on  the  list  are  equally  familiar  to  engineers.  The 
physiologist  Fick  was  the  author  of  Fick’s  law  in  mass  transfer.  The 
hydrodynamicists  Korteweg  (1873)  and  Lamb  (1898)  wrote  beauti¬ 
ful  papers  on  wave  propagation  in  blood  vessels.  Frank  worked  out 
a  hydrodynamic  theory  of  the  heart.  Van  der  Pol  (1929)  wrote  about 
the  modeling  of  the  heart  with  nonlinear  oscillators,  and  was  able 
to simulate the heart with four Van der Pol oscillators to produce
a realistic-looking electrocardiograph. The list would become
too long if we were to continue further; it is perhaps sufficient to
show that there were, and of course are, people who would be
subjects. 

THE  RED  BLOOD  CELL-AN  EXAMPLE  OF  THE 
APPLICATION  OF  ENGINEERING  MECHANICS  TO 
BIOLOGY 

I shall now outline a problem which we worked on recently. This
is concerned with the most important unit of the circulation: the
red blood cells. These cells contain hemoglobin, which performs the
task of oxygenation, taking up oxygen in the lung and passing it to
the tissues of the body. When the red cells float in a stationary bath
under the microscope, they appear as biconcave disks. At the top of
Figure 2 are shown two views of a red cell of a rabbit under a
microscope. The diffraction pattern was caused by the finite
thickness of the red cell and by interference of light whose wavelength
(0.55 micron) was about 1/7 the radius of the red cell. The typical
shape, a donut without a hole, is unmistakable. Viewed as
structures, they are thin-walled shells with a radius-to-thickness ratio of
order 400. The red cells of many mammals are about the same size.
Human red cells have no nuclei, hence they cannot divide and
multiply. The dimensions of a human red cell in units of microns
($10^{-4}$ cm) are shown in Figure 3. The red cells are very flexible. The
lower photograph in Figure 2 shows the deformation of red cells
in flowing through a capillary in the mesentery of a rabbit.

The  question  is,  Why  are  the  red  cells  biconcave?  What  conse¬ 
quences  does  the  biconcavity  imply? 

We can never be too certain why red cells look that way. They got
that way in the process of evolution. The red cells of fishes are not
biconcave; they have nuclei, and they are larger. Animals shed the
nuclei of their red cells when they reach the stage of mammals. In
adult humans, our infant red cells in the bone marrow have nuclei,
but they expel their nuclei when they reach maturity, and only
those biconcave red cells without nuclei are allowed to enter the
blood  circulation.  It  is  easy  to  believe  that  something  good  must 
come  from  that  particular  geometry. 

To  see  what  consequences  the  biconcavity  may  imply,  we  must 
know  the  state  of  the  contents  of  the  red  cells.  Is  it  in  the  state  of  a 
viscous  liquid,  or  is  it  a  gel  solid?  By  definition,  we  call  a  material 
solid if it can sustain shear stress statically without flow. To
determine whether a material is a solid or not we must put some load on
it and observe its deformation. The existence of an X-ray diffraction
pattern alone cannot determine the physical state because we know
there are liquid crystals and there are plastic crystals. We know a
great deal about the chemistry of the hemoglobin, but we do not
know its physical state in the red cell. This simple question of
deformability cannot be answered chemically, nor by an electron
microscope, which requires fixing (and killing) the cells.

Figure 3. Dimensions of human red blood cell according to E. Ponder.

An indirect approach must be taken. Let us make two alternative
assumptions. First, assume that the interior of the red cell is a
liquid and deduce the mechanical properties of the red cell as a
whole. Then we assume that the interior is a gel and see what
differences in mechanical properties may be found. Finally, we
compare the theoretical results with all the known experimental results
and, if necessary, perform new experiments to make a choice. The
knowledge gained in this process will be useful for further
applications.

Assuming a liquid interior, we see a red cell as a thin-walled shell
filled with a liquid. To analyze such a shell is similar to the analysis
of a liquid-fueled rocket, except the red cell problem is more
difficult mathematically. A thin-walled shell resists external load by
“membrane stresses”, i.e., the stress resultants in the shell wall. Now,
if we ignore the variation of stresses across the thickness of the very
thin wall, we find that the partial differential equation which
describes the stress distribution in the shell is either of elliptic type
or of hyperbolic type, depending on the sign of the Gaussian
curvature of the shell surface. The differential equation is elliptic if the
Gaussian curvature is positive; it is hyperbolic if the Gaussian
curvature is negative. The Gaussian curvature is the product of the
two principal curvatures of the surface. For a biconcave shell as
illustrated in Figure 3, the Gaussian curvature is positive in the
neighborhoods of both the equator and the poles but is negative
between the point of inflection B and the crown C. Therefore the
differential equation for the membrane stress changes from elliptic
type near the axis of symmetry, to hyperbolic type beyond the point
of inflection, and back to elliptic type beyond the crown C. At B
and C the Gaussian curvature is zero. Those familiar with the
aerodynamic theory of supersonic flight would recognize that our
mathematical feature is exactly the same as that of the sonic line in a
mixed supersonic and subsonic flow. We have all the difficulties of
the transonic flow theory.
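
The change of type described above can be checked numerically on a model profile. The sketch below is only an illustration under stated assumptions, not the calculation used in this paper: `half_thickness` is a hypothetical meridian profile z = f(x) with made-up coefficients (not measured cell dimensions), and the curvature is evaluated with the standard formula for a surface of revolution, K = f'f'' / [x (1 + f'^2)^2], using finite differences.

```python
import math


def half_thickness(x):
    """Hypothetical meridian profile z = f(x), 0 <= x < 1 (x = r / cell radius).

    The coefficients are illustrative only; they give a dimple at the pole
    and a rounded rim, and are not fitted to any measurement.
    """
    return math.sqrt(1.0 - x * x) * (0.2 + 2.0 * x**2 - 1.1 * x**4)


def gaussian_curvature(x, h=1e-4):
    """Gaussian curvature of the surface of revolution z = f(x) at radius x > 0."""
    f_plus = half_thickness(x + h)
    f_mid = half_thickness(x)
    f_minus = half_thickness(x - h)
    slope = (f_plus - f_minus) / (2.0 * h)             # f'(x), central difference
    bend = (f_plus - 2.0 * f_mid + f_minus) / (h * h)  # f''(x), central difference
    return slope * bend / (x * (1.0 + slope * slope) ** 2)


def equation_type(x):
    """Type of the membrane-stress equation at radius x, from the sign of K."""
    return "elliptic" if gaussian_curvature(x) > 0.0 else "hyperbolic"


if __name__ == "__main__":
    # One sample point in each region: near the pole, between B and C, near the rim.
    for x in (0.2, 0.55, 0.85):
        print(f"x = {x:.2f}: K = {gaussian_curvature(x):+.3f} -> {equation_type(x)}")
```

For this assumed profile the inflection point B and the crown C fall at roughly x = 0.4 and x = 0.7, so the three sample points lie in the inner elliptic, the hyperbolic, and the outer elliptic regions respectively.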

Nature  takes  a  dim  view  of  mathematical  difficulties.  In  transonic 
flow,  shock  waves  soon  appear.  In  our  red  cell  theory,  the  features  that 
come  to  the  rescue  are  the  bending  and  the  nonlinear  effects,  leading 
to  the  buckling  of  the  thin  shell. 

If there is a pressure differential across the red cell membrane, and
if there is no buckling, then we shall encounter a singularity of
infinite membrane stress at the crown. This can be demonstrated
by considering the equilibrium of a segment of the red cell
membrane as shown in Figure 4. Here A is the axis of symmetry. The
pressure inside the red cell is denoted by $p_i$, while the outside
pressure is denoted by $p_o$. If the pressure differential $p_i - p_o$ is
positive, the situation is as shown in Figure 4. The vertical resultant
force $(p_i - p_o)\pi r^2$ must be balanced by the membrane compression
$2\pi r N_\varphi \sin\varphi$, where $\varphi$ is the angle between the tangent to the
membrane and the equatorial plane. Now at C the membrane is parallel
to the equatorial plane, so that $\varphi = 0$, and $N_\varphi$ must tend to
infinity if $p_i - p_o \neq 0$.

Figure 4. Equilibrium of a polar segment of the cell membrane.
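
Written out, the vertical force balance on the polar segment gives the membrane stress resultant explicitly. Here $r$ is taken to be the radius of the circular edge of the segment, which is an interpretation of Figure 4 rather than something the text states in words:

```latex
(p_i - p_o)\,\pi r^2 \;=\; 2\pi r\, N_\varphi \sin\varphi
\quad\Longrightarrow\quad
N_\varphi \;=\; \frac{(p_i - p_o)\, r}{2 \sin\varphi}
```

As $\varphi \to 0$ at the crown C, where $r$ remains finite, the denominator vanishes, so $N_\varphi$ grows without bound unless $p_i - p_o = 0$.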

Perhaps no biological membrane can develop such an infinitely
large stress. What actually happens when the pressure differential
$p_i - p_o \neq 0$ is that the bending and the nonlinear effects take over,
the shell buckles, and the normal biconcave shape will be lost.

We  are  all  familiar  with  buckling  of  thin  shells  under  negative 
pressure,  but  it  is  equally  important  to  remember  that  a  thin  shell 
of  the  biconcave  type  can  buckle  also  under  a  positive  pressure.  It  is 
easy  to  show  by  model  experiments  that  the  equatorial  region  of  a 
biconcave  shell  buckles  into  a  number  of  lobes  as  the  internal 
pressure  is  increased  and  the  pole  region  bulges  out.  This  buckling 
occurs  because  when  the  cell  tries  to  become  spherical  as  the  internal 
pressure  increases,  the  equator  must  be  sucked  in.  The  buckles 
merely  compensate  for  the  reduction  in  equatorial  length.  This  very 
simple  reasoning,  however,  shows  that  the  stress  pattern  in  the  red 
cell  membrane  must  be  quite  complicated  under  such  a  condition. 

We have concluded that if the interior of the red cell is in a
fluid state, and if the pressure differential across the cell membrane
does not vanish, then either the membrane stress $N_\varphi$ would tend
to infinity at the crown, or the cell would buckle. But the normal
red cells in the normal living condition are seen under the
microscope to be perfectly happy in the normal biconcave shape, without
tearing and without buckling. Therefore, we have to reject either
or both of the hypotheses; namely, the liquid interior and the finite
pressure differential assumptions. Sidestepping the solid interior
hypothesis for the moment, we conclude that if the red cell interior
is a viscous fluid, then the pressure differential cannot be finite.
Thus

$$p_i - p_o = 0 \qquad (1)$$

or, more accurately,

$$p_{cr}^{-} < p_i - p_o < p_{cr}^{+} \qquad (2)$$

where $p_{cr}^{+}$, $p_{cr}^{-}$ are the critical buckling pressures under positive
and negative internal pressures respectively. The calculation of
these buckling pressures is not easy, but an estimate for the
mammalian red cell is that $p_{cr}^{+}$ is of the order of 1 mm of water, $p_{cr}^{-}$ is
of the order -0.2 mm water. Thus the internal pressure in the red
cell in the normal biconcave configuration is essentially zero.
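
To give a feeling for how narrow the admissible range in (2) is, the quoted order-of-magnitude estimates can be converted to pascals. The sketch below is only illustrative: the function and constant names are hypothetical, and the conversion assumes the conventional millimetre of water (9.80665 Pa), which the text does not specify.

```python
# Conventional millimetre of water expressed in pascals (assumed convention).
MM_WATER_TO_PA = 9.80665

# Order-of-magnitude estimates quoted in the text, in mm of water.
P_CRIT_POSITIVE_MM = 1.0
P_CRIT_NEGATIVE_MM = -0.2


def mm_water_to_pa(mm):
    """Convert a pressure in mm of water to pascals."""
    return mm * MM_WATER_TO_PA


def stays_biconcave(dp_pa):
    """True if the pressure differential (in Pa) lies strictly inside the range (2)."""
    lower = mm_water_to_pa(P_CRIT_NEGATIVE_MM)
    upper = mm_water_to_pa(P_CRIT_POSITIVE_MM)
    return lower < dp_pa < upper


if __name__ == "__main__":
    print("lower bound [Pa]:", mm_water_to_pa(P_CRIT_NEGATIVE_MM))
    print("upper bound [Pa]:", mm_water_to_pa(P_CRIT_POSITIVE_MM))
    for dp in (-5.0, 0.0, 20.0):
        print(f"dp = {dp:+.1f} Pa -> within range: {stays_biconcave(dp)}")
```

On these estimates the admissible window is only about a dozen pascals wide, which is the sense in which the pressure differential is "essentially zero".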

When we look closer at the mechanics of the red cell, we find that
there are other important assets associated with the biconcave
geometry. One is the great deformability; the other is the reduction
of stresses in resisting the wear and tear experienced by the cell in
its course through the body, squeezing through the capillaries,
turning the corners, getting caught in the sphincters, etc.

To explain this achievement let us look again at the whole cell.
Let us ignore the bending stresses, which are small even at large
changes of curvature because of the very small wall thickness, which
is estimated to be 70 to 100 Angstroms. A deformation will induce
no membrane stress if all the elements of length on the mid-surface
remain invariant during the deformation. In differential geometry
such a deformation is said to be isometric. A surface is said to be
applicable to another surface if one can be deformed into the other
by continuous bending without tearing and without stretching.
Deformation of a surface into an applicable one is isometric, inducing
no change in the membrane strain or stress.

It is well known from differential geometry that if two surfaces are
applicable to each other, their Gaussian curvature must be the same
at the corresponding points. Applying this rule we can determine
whether a given shell can be deformed isometrically into another or
not. For example, a developable surface is applicable to another
developable surface. Thus a thin-walled cylinder can be deformed
into a diamond-patterned bellows composed of flat triangles, as the
reader can easily verify by rolling a sheet of paper into a cylinder
and crushing it longitudinally by loading its ends. A sphere is
applicable to another spherical surface by an inversion, as shown in
Figure 5. But in either case the total volume enclosed in the shell
changes during buckling. If the enclosed volume is not allowed to
change, as in the case of the red cell which is filled with an
incompressible fluid, these deformations cannot occur without
accompanying stretching of the membrane. Here we realize the beauty
of the biconcave geometry of the red cell. There exist an infinite
number of surfaces applicable to the normal geometry (see Figure
5). The red cell can deform into any one of these without any
tearing and stretching of its membrane.

Figure 5. Example of applicable surfaces obtainable by isometric
transformation. The red cell can deform isometrically and isochorically, so that
both the volume and the surface elements are unchanged.
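
The remark that inverting part of a sphere keeps the surface but changes the enclosed volume can be verified with elementary cap formulas. The following is a minimal sketch under simple assumptions: a spherical cap of height h on a sphere of radius R is mirrored through its base plane, and the function names are hypothetical.

```python
import math


def cap_volume(radius, height):
    """Volume of a spherical cap of the given height."""
    return math.pi * height**2 * (3.0 * radius - height) / 3.0


def cap_area(radius, height):
    """Curved surface area of a spherical cap of the given height."""
    return 2.0 * math.pi * radius * height


def invert_cap(radius, height):
    """Return (area, volume) of a sphere after one cap is mirrored inward.

    Mirroring is an isometry, so the cap keeps its area; the enclosed volume
    loses the cap itself plus the equal volume that the dent now occupies.
    """
    if not 0.0 <= height <= radius:
        raise ValueError("cap height must lie between 0 and the sphere radius")
    area = 4.0 * math.pi * radius**2 - cap_area(radius, height) + cap_area(radius, height)
    volume = 4.0 / 3.0 * math.pi * radius**3 - 2.0 * cap_volume(radius, height)
    return area, volume


if __name__ == "__main__":
    R, h = 1.0, 0.5
    area_after, volume_after = invert_cap(R, h)
    print("area before/after:  ", 4.0 * math.pi * R**2, area_after)
    print("volume before/after:", 4.0 / 3.0 * math.pi * R**3, volume_after)
```

The area is unchanged while the volume drops, which is why such an inversion is not available to a shell filled with an incompressible fluid.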

This  at  once  explains  the  great  deformability  of  the  red  cell  and 
the  soundness  of  the  cell  design.  The  role  it  plays  demands  that  it 
be  tough  and  flexible.  It  obtains  the  toughness  by  being  able  to 
deform  easily  and  avoiding  the  stress.  Of  course,  the  red  cells  are 
stressed  whenever  the  velocity  gradient  and  the  geometric  conditions 
do not permit them to deform isometrically. These “off-design”
conditions must be investigated separately in each special case.

We may continue to examine what happens when the volume of
the red cell does change, as when the cell is placed in a hypotonic
or hypertonic medium; or what happens when the surface tension
or the membrane structure is changed when certain chemicals are
added to the plasma. We may look into what is to be expected of
ruptured red cells, the so-called ghosts. Finally, we may examine the
consequences of the alternative hypothesis, namely that the interior
content of the red cell is a gel solid. Analytic solutions to these
problems are important because they offer us the means to determine
the red cell’s properties, such as the distribution of the thickness of
the cell membrane, its stress-strain law, its strength, its changes in
flow in vivo and in vitro, and so on.

The  theoretical  results  must  be  compared  with  experiments  to 
see  whether  the  hypotheses  are  justified.  Our  theoretical  examina¬ 
tion  has  led  to  a  number  of  experiments  which  are  still  under  way. 
At  this  time,  the  assumption  that  the  normal  red  cell  interior  is  a 
liquid  seems  to  be  in  agreement  with  all  known  facts. 

One  may  ask,  “What  is  the  use  of  such  an  investigation?”  Our 
answer  is,  “We  want  to  know  how  to  handle  the  blood  in  the  body 
and in the heart-lung machine. We want to know the causes of
damage to the red cells and how to avoid them. We want to know
how oxygenation is accomplished and how to control it. For these
we  want  to  know  how  the  red  cell  is  stressed,  how  it  is  damaged, 
how  it  interacts  with  the  capillary  wall,  how  it  influences  the  blood 
viscosity,  and  so  on.”  Our  interest  in  the  red  cell  mechanics  is  more 
than  academic. 

THE  STRESS-STRAIN-HISTORY  LAW  OF  SOFT  TISSUES- 
INTRODUCTION  OF  A  NEW  MECHANICS 

The  red  cell  problem  considered  above  illustrates  the  type  of 
approach to biology that an engineer takes. To illustrate the other
side of the interaction, the enrichment of engineering science by
biology, it is perhaps best to consider the properties of materials
that are thrust upon the engineer. Again consider mechanics. In
engineering mechanics we are familiar with linear elastic materials
which obey Hooke’s law, that the stress tensor is a linear function
of the strain tensor. This law describes admirably well most
structural materials below the yielding point. Beyond the yielding point
most metals show the phenomenon of plastic yielding. Under
repeated-stress cycles most structural materials dissipate energy and
deviate from Hooke’s law, leading to the so-called viscoelasticity
phenomenon.  In  recent  years  rubbery  materials  became  important 
and  the  theory  of  finite  deformation  became  a  strong  new  branch 
of  mechanics.  In  bioengineering,  we  have  to  consider  living  tissues. 
There  are  body  tissues  that  may  be  described  by  these  familiar  laws. 
However,  there  are  other  tissues  that  require  an  entirely  new 
description.

Let us consider the soft tissues of our body, such as the skin, the
muscle,  the  blood  vessels,  and  other  connective  tissues.  It  is  obvious 
that  the  mechanical  properties  of  these  tissues  are  important  to 
problems  of  physiology,  pathology,  medicine,  diagnosis,  and  pros¬ 
thetics.  We  shall  discuss  in  particular  the  properties  of  the  mesen¬ 
tery,  which  is  a  thin  membrane  connecting  the  intestines.  The 
mesentery  is  a  favorite  tissue  of  the  physiologist;  it  is  transparent,  of 
uniform  thickness,  and  has  a  microcirculation  system  that  can  be 
easily  seen  under  the  microscope.  Much  of  our  knowledge  about 
microcirculation  is  obtained  from  the  mesenteric  flow. 

It  is  well  known  that  smaller  blood  vessels  are  stiffer.  The  prin¬ 
cipal  reason  for  this  stiffness  variation  is  simply  the  effect  of  the 
size.  For  geometrically  similar  blood  vessels,  the  stiffness,  defined  by 
the  derivative  dp/dR,  where  p  is  the  internal  pressure  and  R  the 
lumen  radius,  increases  as  R  decreases.  However,  for  the  capillary 
blood  vessels  in  the  mesentery,  the  experimental  results  of  Baez  and 
Lamport  indicate  that  the  capillaries  with  diameter  of  order  10 
micra  are  much  more  rigid  than  the  geometric  factor  alone  would 
imply.  To  explain  this  unusually  high  rigidity  of  the  capillary  blood 
vessels in the mesentery, we ask whether the tissue surrounding the
blood vessels contributes significantly to the elasticity of the blood
vessel.  In  the  mesentery  the  capillaries  appear  to  be  embedded  in  a 
gel.  However,  it  is  obviously  difficult  to  measure  the  distensibility 
of  blood  vessels  both  with  and  without  the  surrounding  tissues. 
Hence  it  is  not  surprising  that  few  data  can  be  found  concerning 
this  subject. 
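
The purely geometric size effect mentioned at the start of this discussion can be illustrated with the simplest possible model. The sketch below is an assumption made for illustration and is not the analysis used in this paper: it treats the vessel as a thin-walled, linearly elastic (Hookean) tube, for which small changes give dp/dR ≈ E h / R², and it takes "geometrically similar" to mean a fixed ratio of wall thickness h to lumen radius R. The modulus and ratio in the usage lines are hypothetical numbers.

```python
def tube_stiffness(youngs_modulus, radius, wall_ratio):
    """Approximate dp/dR for a thin-walled Hookean tube.

    youngs_modulus: wall modulus E (Pa), assumed constant
    radius:         lumen radius R (m)
    wall_ratio:     wall thickness divided by radius, h / R (dimensionless)
    """
    wall_thickness = wall_ratio * radius
    return youngs_modulus * wall_thickness / radius**2


if __name__ == "__main__":
    E = 1.0e5          # hypothetical wall modulus, Pa
    ratio = 0.1        # hypothetical h / R, the same for both vessels
    large = tube_stiffness(E, 50.0e-6, ratio)  # 100-micron-diameter vessel
    small = tube_stiffness(E, 5.0e-6, ratio)   # 10-micron-diameter vessel
    print("stiffness ratio small/large:", small / large)
```

In this model the stiffness varies as 1/R for similar vessels, so a tenfold reduction in radius gives a tenfold increase in dp/dR; the measurements cited in the text show capillaries to be much stiffer than even this scaling would suggest.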

One  way  to  evaluate  the  effect  of  the  surrounding  tissue  is  to 
measure  the  elasticity  of  the  tissue  and  then  calculate  the  contribu¬ 
tions  under  the  hypothesis  that  the  blood  vessels  are  in  direct  con¬ 
tact  with  the  tissue.  For  this  purpose  a  series  of  experiments  was 
made  on  the  avascular  portion  of  the  mesentery  of  the  rabbit.  When 
these  experimental  results  were  applied  to  the  mesentery,  it  was 
shown  that  the  percentage  contribution  of  the  surrounding  tissue 
to  the  total  stiffness  of  the  blood  vessels  is,  for  capillary,  99.7%, 
centrally  located  venule,  61.3%,  arteriole,  45.2%,  eccentrically 
located  venule,  41.7%  and  terminal  artery,  11.8%,  provided  that 
certain  typical  dimensions  and  typical  stretching  of  the  mesentery 
were assumed. Although these percentage values would vary if the
assumed dimensions and stretching were varied, it is evident that
the surrounding tissue contributes greatly to the elasticity of the
blood  vessels.  The  larger  the  effective  size  of  the  surrounding 
tissue  relative  to  the  vessel,  the  greater  is  the  contribution.  The 
capillary,  which  derives  almost  all  of  its  elastic  stiffness  from  the 
surrounding  tissues,  may  be  described  mechanically  as  a  tunnel  in  a 
gel. 

The  analysis  that  leads  to  the  tunnel-in-gel  concept  of  capillary 
is  an  application  of  the  stress-strain  law  of  the  mesentery.  Let  us  now 
turn to the law itself. When a narrow rectangular strip of the avascular
portion of the mesentery of a rabbit was tied at the ends and
tested in a tensile testing machine, the results were as presented in
Figures 6 to 8.

Figure 6 shows the load-deflection curve of a specimen when the
rate of strain imposed was 0.254 cm per minute. The ordinate shows
the load in grams. The abscissa shows the deflection in centimeters.
The relaxed length of the specimen was $l_0 = 1.22$ cm. When the
specimen was stretched from $l_0$ to $l_1 = 2.54$ cm, the corresponding
tension induced was very small; in fact it was not readable in the
chart illustrated in Figure 6. Extension beyond $l_1$, however, induced
a rapidly increasing tension. The load-deflection relationship was
definitely nonlinear.

Figure 6. The load-deflection curve of a rabbit mesentery in tension. The
state corresponding to the naturally spread-out mesentery is marked by the
small circle. The point marks the relaxed length of the specimen.

Figure 7 shows typical hysteresis curves of the specimen. It is seen
that hysteresis exists, but it is not very large. Although it is not shown
in the figure, a completely unloaded specimen gradually returned
to the initial length $l_0$. In other words, there was no doubt that the
material was elastic, although the modulus of elasticity was very
small in the range $l_0$ to $l_1$. The curve marked “high” was produced
at a strain rate ten times faster than that marked “low”. It is seen
that the hysteresis loops did not depend very much on the rate of
strain.

Figure 7. Hysteresis curves of rabbit mesentery obtained at different strain
rates. The high rate was ten times that of the low rate. Only slight change
in hysteresis curves was obtained. Some of the small difference is due to
fatigue, some is due to strain rate. (Specimen M-6, rabbit mesentery;
$l_0$ = 0.34 in.; strain rate: upper curve, high, 0.2 in./min; lower curve,
low, 0.02 in./min.)

Figure 8 shows a stress-relaxation curve. The specimen was
strained at a constant rate until a tension $T_1$ was obtained. The
length of the specimen was then held fixed and the change of tension
with time was plotted.

Figure 8. Relaxation curve. Rabbit mesentery. The specimen was stressed
at a strain rate of 1.27 cm per minute (0.5 in./min) to the peak, then the
moving head of the testing machine was suddenly stopped so that the strain
remained constant. The subsequent relaxation of stress is shown.


There are other features of interest. For example, if the specimen
is loaded and unloaded at a fixed rate of strain between two fixed
limits of extension, the peak response decreases with the number of
cycles. It is a form of “fatigue”. Finally, if the specimen is strained
up to failure, the failure curve resembles that of ordinary structural
materials. The ultimate load for the specimen shown in Figure 6
was about 31 grams. The ultimate strain at breaking was
$\lambda \approx 2.17$. The specimen failed by tearing at some unpredictable
points. In all these experiments the specimens were suspended in a
physiological solution at room temperature.

It is evident from the curves shown in Figures 6-8 that the
stress-strain relationship for the mesentery and the arteries is nonlinear,
and that the stress does not depend on the strain alone, but also on the
strain history. Let the stress-strain relationship be separated into
two parts: an elastic part and a history-dependent part. The elastic
part defines a unique stress-strain relationship, i.e., the elasticity
of the material. The history-dependent part is time dependent; it
is related to the hysteresis, stress relaxation, creep, and other
nonconservative phenomena. Thus we may write, for the simple elongation,

$$T(t) = F[\lambda(t)] + P[\lambda(t-\tau);\, t, \tau] \tag{3}$$

where $T(t)$ is the tensile stress at time $t$ referred to the undeformed
state, i.e., the tension divided by the original cross-sectional area,
and $\lambda(t)$ is the extension ratio which describes the strain. The
extension ratio is defined as the distance between two points in the
deformed state of the body divided by the distance between these
two points in the undeformed body. $F[\lambda(t)]$ is a function of the
extension ratio $\lambda$ at time $t$, whereas $P[\lambda(t-\tau);\, t, \tau]$ is a function of the
entire history of the deformation, $\lambda(t-\tau)$.

Consider first the elastic response $F(\lambda)$. The most striking feature
of the elasticity of a living tissue as seen from Figure 6 is the very
small stress in response to large initial strain. In Figure 6 an
extension up to about 100 percent of the relaxed length yields only a
small, unmeasurable tension. However, for $\lambda$ greater than 2 the
stress rises rapidly, and indeed, exponentially. When the slope of
the elastic tension-deflection curve, $dT/d\lambda$, is plotted against the
elastic tension $T$, a remarkable correlation exists which may be fitted
by a straight line in the first approximation:

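$$\frac{dT}{d\lambda} = aT, \qquad (1 \le \lambda \le \lambda_y) \tag{4}$$

An integration with the boundary condition $T = T^*$ when $\lambda = \lambda^*$
gives

$$T = T^* e^{a(\lambda - \lambda^*)}, \qquad (1 \le \lambda \le \lambda_y) \tag{5}$$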

This simple relation is remarkable, indeed. It shows that the tensile
stress is an exponential function of the extension ratio. It may be
shown that the skin, the series elastic element of the striated muscle
and the heart muscle follow the same trend. It appears that the
exponential type of material is natural in the biological world.
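
The straight-line correlation of Eq. (4) and its integral, Eq. (5), can be checked numerically. The following is a minimal Python sketch, not the author's procedure: it generates stresses from the exponential law, estimates the slope $dT/d\lambda$ by finite differences, and recovers the constant $a$ from a least-squares line through the origin. The parameter values (`A_TRUE`, `T_STAR`, `LAM_STAR`) and all function names are hypothetical, chosen only for illustration; they are not data from the experiments.

```python
import math


def exponential_stress(lam, a, t_star, lam_star):
    """Lagrangian stress T = T* exp[a (lam - lam*)], the integral of dT/dlam = a T."""
    return t_star * math.exp(a * (lam - lam_star))


def fit_slope_through_origin(xs, ys):
    """Least-squares slope of the line y = a x (no intercept)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


# Hypothetical parameters (assumptions for illustration only).
A_TRUE, T_STAR, LAM_STAR = 12.0, 20.0, 2.1

# Extension ratios from 1.05 to 2.10.
lams = [1.0 + 0.05 * i for i in range(1, 23)]
stresses = [exponential_stress(l, A_TRUE, T_STAR, LAM_STAR) for l in lams]

# Central finite-difference estimate of dT/dlam at each extension ratio.
h = 1e-5
slopes = [
    (exponential_stress(l + h, A_TRUE, T_STAR, LAM_STAR)
     - exponential_stress(l - h, A_TRUE, T_STAR, LAM_STAR)) / (2.0 * h)
    for l in lams
]

# Plotting slopes against stresses gives a straight line through the origin;
# its slope is the constant a of Eq. (4).
a_fit = fit_slope_through_origin(stresses, slopes)
print(f"fitted a = {a_fit:.4f}")
```

With measured data, `stresses` and `slopes` would be taken from the recorded load-deflection curve rather than from the model itself.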

Modification of this formula to account for the curvature of the
experimental curve of $dT/d\lambda$ vs $T$ can be introduced easily. However,
it should be noted that $T$, as given by the exponential function
above, does not vanish unless $\lambda \to -\infty$. By definition, however,
we must have $T = 0$ when $\lambda = 1$, which defines the unstrained
state. Hence a modification is necessary in order to account for this
initial condition. This can be done in several ways. The simplest
way is to introduce a polynomial factor that vanishes at $\lambda = 1$.
Guidance for such a modification can be obtained from the general
theory of elasticity. It can be shown that if a strain energy function
$W$ exists, then for an isotropic incompressible elastic body
subjected to simple elongation, the Lagrangian tensile stress is
given by


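$$T = 2\left(\lambda - \frac{1}{\lambda^2}\right)\left(\frac{\partial W}{\partial I_1} + \frac{1}{\lambda}\frac{\partial W}{\partial I_2}\right) \tag{6}$$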


where $I_1$, $I_2$ are the strain invariants.

When $\lambda \to 1$, the zero factor must be of the form $(\lambda - 1/\lambda^2)$ if
the strain energy has no singularity at the undeformed state, i.e., if
$\partial W/\partial I_1$, $\partial W/\partial I_2$ are finite and continuous at $\lambda = 1$. Adopting this
zero factor, we obtain


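$$T = \frac{T^*}{\lambda^* - 1/\lambda^{*2}}\left(\lambda - \frac{1}{\lambda^2}\right) e^{\beta(\lambda - \lambda^*)}, \qquad (1 \le \lambda \le \lambda_y) \tag{7}$$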

Here the constant $\beta$ no longer has the same simple meaning as $a$,
the slope of the $dT/d\lambda$ versus $T$ curve. However, the exponential
factor in this equation is so powerful that as far as the mesentery is
concerned, Eqs. (5) and (7) plot out to be almost the same curves,
with only a slight difference between $a$ and $\beta$. For other soft tissues,
the polynomial factor may play an important role and cannot be
ignored.
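
The closeness of Eqs. (5) and (7) can be illustrated with a small Python sketch. It rests on several assumptions that are not taken from the text: the zero factor is taken as $(\lambda - 1/\lambda^2)$ and normalised so that $T = T^*$ at $\lambda = \lambda^*$; the exponent $\beta$ is chosen so that both laws have the same logarithmic slope at $\lambda^*$; and the numbers `A`, `T_STAR`, `LAM_STAR` are hypothetical. All function names are likewise illustrative.

```python
import math


def zero_factor(lam):
    """Factor (lam - 1/lam^2), which vanishes in the unstrained state lam = 1."""
    return lam - 1.0 / lam**2


def stress_exponential(lam, a, t_star, lam_star):
    """Pure exponential law, as in Eq. (5)."""
    return t_star * math.exp(a * (lam - lam_star))


def stress_with_zero_factor(lam, beta, t_star, lam_star):
    """Exponential law times the zero factor, scaled so T(lam_star) = t_star (assumed form)."""
    scale = t_star / zero_factor(lam_star)
    return scale * zero_factor(lam) * math.exp(beta * (lam - lam_star))


def beta_matching_slope(a, lam_star):
    """beta giving both laws the same d(ln T)/d(lam) at lam_star (an illustrative choice)."""
    dlog_zero = (1.0 + 2.0 / lam_star**3) / zero_factor(lam_star)
    return a - dlog_zero


# Hypothetical parameters (assumptions for illustration only).
A, T_STAR, LAM_STAR = 12.0, 20.0, 2.1
BETA = beta_matching_slope(A, LAM_STAR)

# Compare the two laws over the upper, load-bearing range of extension.
lams = [1.8 + 0.01 * i for i in range(31)]
max_rel_diff = max(
    abs(stress_with_zero_factor(l, BETA, T_STAR, LAM_STAR)
        / stress_exponential(l, A, T_STAR, LAM_STAR) - 1.0)
    for l in lams
)

print(f"a = {A:.3f}, beta = {BETA:.3f}")
print(f"largest relative difference on [1.8, 2.1]: {max_rel_diff:.3%}")
print(f"stress at lam = 1 with zero factor: {stress_with_zero_factor(1.0, BETA, T_STAR, LAM_STAR)}")
```

In this sketch the two curves differ by a few percent where the stress is appreciable, while only the modified law vanishes at $\lambda = 1$.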

We note that, in general, the resting configuration of a soft tissue
in the body is not the unstrained state. People with the experience
of cutting a major artery often find that the vessel shrinks away from
the cut. The determination of the unstrained natural state is a
difficult task in biological experiments.

In view of these facts, it is elementary to point out that the
usual practice in the physiological literature of presenting one number
for the Young's modulus of elasticity of a living tissue is meaningless.
The Young's modulus of a living tissue varies greatly with the
stress,  ranging  all  the  way  from  almost  zero  to  about  5  X  1®* 
dynes/cm$^2$ for the mesentery. On the other hand the slope at the
origin, $a$, and the curvature, $2ab$, of the $dT/d\lambda$ vs $T$ curve are
constant over the physiological range for many soft tissues. These
parameters, together with a specific tension (or Lagrangian stress) $T^*$ at
a specific extension $\lambda^*$, completely characterize the elastic curve for
the lower stress range. They are the candidates for data collection
and data presentation.

Let us now consider the history-dependent part of the stress-strain
relationship. In this respect, the most striking feature of the
mesenteric response, as shown in Figure 7, is that the hysteresis curves are
insensitive to the strain rate. Otherwise, the relaxation curve of
Figure 8 would suggest that the relaxation function may be
approximated by an exponential function. However, any finite sum of
such exponentials (discrete frequency spectrum) would imply a
frequency dependence of the hysteresis loop. Therefore, the hysteresis
and the relaxation curves taken together mean that a discrete
relaxation spectrum is inadequate. It turns out that the inversion of the
experimental relaxation curve to determine the frequency spectrum
is mathematically a problem of inversion of the Laplace transform on
the real axis (the transform is real-valued and is known only for
real arguments), and is a rather tenuous process. The same
relaxation curve may be fitted approximately by several frequency
functions, and a selection of the exact inverse would demand a
numerical precision of the relaxation function unobtainable in experiments.
However, the choice of the inverse can be based on the hysteresis
curve. With such reasoning, we formulate the stress-strain-history
law as follows:

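$$T(t) = F[\lambda(t)] + \int_0^t \int_0^\infty \alpha(\nu)\, e^{-\nu(t-\tau)}\, \frac{dF[\lambda(\tau)]}{d\tau}\, d\nu\, d\tau \tag{8}$$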

where $F[\lambda]$, the elastic response, is given by the right hand side of
Eq. (7). The integral is a linear superposition of the past stress
history; $\alpha(\nu)$ is the frequency spectrum, and is assumed to be a
continuous function of the frequency $\nu$. The function $\alpha(\nu)$ spreads out
the characteristic frequencies so that the strain rate effect is spread
out in such a way that the hysteresis becomes rate insensitive in the
practical range of strain rates. The “semi-linear” superposition,
which makes the stress depend linearly on the stress-history and
nonlinearly on the strain-history, is a simplification without which
the stress-strain law would be much more complicated.
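
The argument that a discrete spectrum makes hysteresis rate dependent, while a continuous one can make it nearly rate insensitive, can be illustrated in a linearised setting. The Python sketch below is an illustration under assumptions that go beyond the text: it considers small sinusoidal strains, for which a relaxation term $g\,e^{-\nu t}$ contributes $g\,\omega\nu/(\omega^2 + \nu^2)$ to the dissipative (loss) part of the response at angular frequency $\omega$, and it adopts a hypothetical continuous spectrum $\alpha(\nu) = c/\nu$ between two cut-off frequencies. The text does not specify the analytical form of the spectrum, so this choice, the cut-offs, and the function names are all assumptions.

```python
import math


def loss_single(omega, nu, g=1.0):
    """Loss contribution of one relaxation term g*exp(-nu*t) at angular frequency omega."""
    return g * omega * nu / (omega**2 + nu**2)


def loss_continuous(omega, nu_lo, nu_hi, c=1.0, n=2000):
    """Loss for the assumed spectrum alpha(nu) = c/nu on [nu_lo, nu_hi].

    Integrates alpha(nu) * omega*nu/(omega^2 + nu^2) d(nu) by the trapezoid
    rule in u = ln(nu); with d(nu) = nu du the integrand becomes
    c * omega*nu/(omega^2 + nu^2).
    """
    u_lo, u_hi = math.log(nu_lo), math.log(nu_hi)
    du = (u_hi - u_lo) / n
    total = 0.0
    for i in range(n + 1):
        nu = math.exp(u_lo + i * du)
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * c * omega * nu / (omega**2 + nu**2)
    return total * du


# Compare the loss at two driving frequencies a factor of ten apart.
OMEGA_LOW, OMEGA_HIGH = 1.0, 10.0

ratio_single = loss_single(OMEGA_HIGH, nu=1.0) / loss_single(OMEGA_LOW, nu=1.0)
ratio_continuous = (loss_continuous(OMEGA_HIGH, 1e-3, 1e3)
                    / loss_continuous(OMEGA_LOW, 1e-3, 1e3))

print(f"single relaxation term, loss ratio for tenfold rate change: {ratio_single:.3f}")
print(f"continuous 1/nu spectrum, loss ratio for tenfold rate change: {ratio_continuous:.3f}")
```

A single term loses most of its dissipation when the rate moves a decade away from its characteristic frequency, whereas the broad spectrum keeps the loss almost unchanged, which is the qualitative behaviour described for Figure 7.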

If the elastic stress $F[\lambda(t)]$ is a step function, then $T(t)$ is the
relaxation curve, and it is seen that $\alpha(\nu)$ is the inverse Laplace
transform of $T(t)$.

Without further elaboration on the analytical form of the
spectrum $\alpha(\nu)$ and the three-dimensional generalisation of the
one-dimensional stress-strain law, we can characterize a living soft tissue
by saying that the elastic stress is essentially an exponential function
of the extension ratio, and that the viscoelasticity is semi-linear,
with a relaxation function that has a continuous frequency spectrum. As
far as I know, there is no engineering material that behaves in this
way. Mechanics of such materials must be worked out anew. There
is no catalog of existing solutions to boundary-value problems of
such materials. Developments in this field would undoubtedly foster
further advances in nonlinear mechanics.

These examples show that the theme of bioengineering as a
scientific discipline, the enlargement of engineering sciences to deal
with biomedical problems, is not empty. Engineers offer a new set
of tools, the methods of engineering analysis and synthesis, to
biology and medicine.

In  return,  the  need  to  sharpen  their  tools  to  deal  with  new  prob¬ 
lems  in  bioengineering  would  certainly  enrich  engineering  sciences. 
However, we are witnessing only the sprouting of the seeds; the
flowering and fruition would await the efforts of the future.


V.  The  Organisation  of 
Living  Memory  Systems 

J. Z. Young

It has been noticed that wise biologists are unwilling to answer
the question “What is life?” To the layman it might seem that this
should be the problem that most concerns biologists, but the
layman forgets that science has given up trying to answer such simple
questions as “what is heat,” or “light,” or even “matter.” However,
it is a sign of the growing maturity of biology that although we
refuse to say what life is, we can now give a rather precise and
unambiguous definition of living organisms. They are very complicated
systems and the definition is long and rather dull. It begins by
saying  that  the  systems  are  composed  of  some  Hi  of  the  92  natural 
elements, mainly of carbon, hydrogen, oxygen and nitrogen. It notes
that these are far from being the commonest elements in the earth's
composition but they are the four smallest elements in the Periodic
System that make stable electronic configurations by accepting 1, 2,
3 or 4 electrons (see Wald 1964). Moreover carbon, nitrogen and
oxygen are the only elements that regularly form double and triple
bonds. The four common bioelements are thus particularly “suited”
to form molecules and since the elements are small the molecules
are stable. Living organisms are in fact remarkably stable. We
express this by saying that they conserve patterns of order, often for
many millions of years. To quote one outstanding example, the
Australian lung-fishes are almost identical with their ancestor
Ceratodus, which lived 300 million years ago. Again, the shrews that

[Footnote: J. Z. Young is Professor of Anatomy at University College, London.
His main interest is in the structure and function of nervous tissue.]
are alive today are very little different from the shrews of the
Cretaceous period 80 million years ago, who incidentally were also
very like our own ancestors.

Besides containing much that is stable, living things are also very
liable to change slightly; moreover they exist in great variety. This
may be partly a reflection of the properties of the element carbon,
with its four valencies and great families of homologous series of
compounds, fatty acids, esters, alcohols, aldehydes and so on. Yet in
spite of their variety all living organisms are built of remarkably
similar molecules. There are compounds such as nicotinate,
pantothenate, thiamin, adenosine triphosphate and many others that
occur not only widely but probably in all animals and plants.
Elaborate compounds, such as the porphyrins, turn up in slightly
different forms as the chlorophyll of plants (combined with
magnesium) or the haemoglobin or cytochrome of animals (combined with
iron). Especially astonishing are the similarities of the proteins.
These are composed of chains of amino acids, variously folded. But
only about 20 of all the possible amino acids are used in animals,
plants or bacteria. Most astonishing of all is the regulation of the
order of living things by what we shall call the informational
molecules, the nucleic acids. These serve to arrange the amino-acids
in their particular orders on the protein chains. It is probable that
approximately the same combinations of three nucleotide bases
serve to introduce given amino-acids in viruses, bacteria and man.
It is as if the instructions of the living systems were all written in
the same language.

At more elaborate levels of organisation there are further
similarities between organisms. Thus most organisms are based on the
cellular structure of nucleus, cytoplasm and cell membrane. The
electron microscope has shown a large range of organelles that are very
widely spread among animals and plants. The endoplasmic
reticulum is a system of intracellular channels, to some of which are
attached the ribosomes, which are concerned during protein synthesis
with carrying the amino-acids to their correct positions on the proteins.
The mitochondria are folded structures carrying the enzymes
concerned in making energy available. Cilia are little whips attached to
cells to produce movement and they have almost identical
structures wherever they occur, with a ring of nine fibres and two more
in the centre.
With all of these facts, then, we can indeed define living
organisms very much more precisely than was possible when all that
could  be  said  was  that  they  were  made  of  protoplasm  or  animal 
spirits.  Nevertheless  it  will  be  felt  that  by  such  descriptions  of  the 
composition  of  organisms  we  have  not  come  very  near  to  achieving 
what  could  be  regarded  as  an  adequate  definition  of  living  things. 
And  indeed  it  is  characteristic  of  them  that  they  are  not  entities  at 
equilibrium  and  therefore  cannot  be  statically  defined.  They  are 
open  systems,  continually  interchanging  with  the  environment  and 
maintaining  a  steady  state.  This  active  interchange  is  perhaps  as  near 
as  we  can  come  to  an  abstract  entity  that  we  can  refer  to  as  “life.” 
The  most  characteristic  thing  about  it  is  that  it  continues.  The 
biologist  thus  must  recognise  the  directional  characteristic  of  the 
reactions  that  he  studies.  These  reactions  have  had  the  effect  of 
maintaining  this  astonishing  continuity  of  living  systems  probably 
for  over  2000  million  years.  The  subject  matter  of  biology  demands 
an  enquiry  into  how  this  stability  is  ensured.  It  is  not  only  that  the 
biologist  is  not  ashamed  to  study  this  directional  activity  but  rather 
that  he  is  neglecting  his  subject  if  he  does  not  do  so.  When  a  bio¬ 
chemist  studies  the  reactions  of  the  substances  that  he  finds  in  an 
organism  today,  he  will  sooner  or  later  discover  that  he  has  to  study 
their  history  and  indeed  their  function  or  purpose  in  the  plan  of 
self-maintaining  activities. 

This, of course, takes us back to the question of how such systems
began. The origin of life is a question that is discussed by reputable
scientists today and we can only say here that their tentative
conclusion is that it began by the formation of complex compounds in a
reducing atmosphere over 2000 million years ago. There is a
widespread “belief” that this occurred by “natural” processes but there
is doubt as to what these conditions might have been to account for
the production of the particular ordered sequence of nucleotides
that would serve to produce an active protein. There is, indeed, in
this discussion an element of “which came first, the hen or the
egg,” since the nucleotides can only operate with the assistance of
enzymes to build proteins, but the enzymes are themselves proteins.
Haldane has speculated that the first self-replicating nucleic acid
molecules might have been able to code for ribonuclease, which is
one of the simplest of proteins (containing about 120 amino-acids).

Even the simplest self-maintaining organisms today, the bacteria,
are of a complexity that is very difficult for us to visualise or
otherwise model. They maintain their active balance and steady state by
putting  into  operation  at  the  right  rate  some  thousands  of  enzymes, 
which  are  catalysts  that  make  reactions  take  place  at  relatively  low 
temperatures.  It  is  interesting  to  try  to  elucidate  the  principles  by 
which  all  this  activity  is  so  regulated  as  to  ensure  continuity.  We 
may  concentrate  upon  two  of  them.  First  there  must  be  a  repertory  of 
possible  states  of  the  system,  incorporated  in  the  various  nucleotide 
combinations  that  have  survived  because  they  have  made  proteins 
that  have  been  useful  for  the  organisms  in  the  past.  The  biochemist 
will  find  no  explanation  for  the  presence  of  any  one  nucleotide 
combination  unless  he  knows  that  it  codes  for  a  particular  protein 
that  has  played  a  part  in  keeping  the  system  stable  and  may  be 
needed  again.  This  set  of  states  thus  constitutes  the  basic  memory 
record.  It  is  a  representation  of  the  past  in  the  literal  sense  that  it 
allows  the  organism  to  re-present  to  the  environment  a  response  that 
is  appropriate  for  ensuring  continuity. 

The  second  principle  thus  emerges  as  the  power  to  respond  to  the 
environment.  This  depends  upon  some  form  of  detectors,  sensitive, 
for  example,  to  excess  or  to  deficiency.  Thus,  in  the  classic  case,  a 
bacterium  in  the  presence  of  a  new  raw  material  (substrate)  will 
synthesise  an  enzyme  to  make  that  material  available.  Without  go¬ 
ing  into  detail  we  can  say  that  this  depends  upon  having  a)  some 
detectors  or  sensors  of  environmental  conditions,  b)  a  communica¬ 
tion  channel  by  which  to  transmit  to  the  memory  what  we  may  call 
the  information  as  to  what  is  needed  so  that  c)  actions  that  are 
effective  are  performed.  For  bacteria  Jacob  and  Monod  and  other 
biochemists  have  provided  a  lot  of  evidence  about  the  details  of 
this  communication  channel. 

The  fundamental  principles  for  the  maintenance  of  a  stable 
steady  state  system  are  thus  essentially  those  of  communication. 
These  include  the  presence  of  a  selected  memory  record  represent¬ 
ing  events  in  the  past,  sensors  able  to  detect  change  and  to  signal 
to the memory so that effective actions are performed. These are
indeed obviously the minimal elements that must be involved in a
steady state, but the logical features of the organisation of such a
system are less clear than they seem. The actions must be such as
are compatible with the future conditions of the surroundings and
this is ensured only if the actions forecast the conditions. All life
participates in the character of what v. Foerster (1962) has called an
inductive  inference  computer.  Such  activities  are  particularly  con¬ 
spicuous  in  the  nervous  system,  with  its  memory  and  communica¬ 
tion  networks.  But  this  is  only  a  special  case.  We  shall  now  show 
how  the  same  principles  of  communication  and  control  are  manifest 
in  all  living  activities  and  in  this  way  we  shall  try  to  elicit  the  essen¬ 
tial  features  of  those  principles. 

The  concept  of  “control"  is  not  at  first  easy  to  reconcile  with  the 
physical  scientist’s  approach  to  the  organism  as  a  reacting  system 
like  any  other.  It  can  perhaps  be  expressed  by  saying  that  the  present 
state  of  any  living  system  is  dependent,  like  any  other  system,  (1)  on 
the  surrounding  conditions  (2)  on  its  present  internal  state,  which  in 
turn  depends  on  its  history.  There  is  nothing  unusual  in  this,  except 
that  the  “informational"  molecules  are  of  a  character  such  as  to 
carry  an  unusually  long  and  complicated  record  of  their  history.  So 
we  are  back  at  the  fact  that  the  essential  character  of  living  organ¬ 
isms  is  their  peculiar  components,  but  we  have  added  now  that  these 
lead  them  to  operate  in  certain  unusual  ways,  which  give  the  ap¬ 
pearance  of  foresight. 

The  particular  state  of  a  living  organism  at  any  time  is  thus  de¬ 
pendent  upon  two  sets  of  factors,  the  internal  organisation  that  it 
receives  from  the  past  and  the  constraints  imposed  from  outside  at 
present.  This  double  dependence  ensures  that  the  state  and  actions 
of the system are such as to lead to its continuance (Young 1946).
Organisms have this property, of course, because selective action
upon the memory in the internal organisation has ensured that it
“instructs” the system to proceed in that way, selecting “correct”
responses from the repertory that has been built up. We have now
to  ask  how  living  systems  manage  to  make  "correct"  responses. 

The  capacity  to  make  the  right  adjustments  depends  upon  the 
dynamic  characteristic  of  the  system,  on  the  fact  that  it  is  constantly 
changing.  One  of  the  most  revolutionary  of  all  techniques  in  biology 
has been the demonstration, by use of isotopes, of what
Schoenheimer called the “dynamical state of bodily constituents.” Tissues
vary considerably in their degree of permanence, but in the majority
there is a rather rapid turnover. Thus by injecting cholesterol
labelled with ¹⁴C into the yolk sac of day-old chicks it can be shown
that this labelled material has a half-life in the liver or kidney of
about  5>/,  days.  Some  constituents  have  shorter  half-lives,  others 
longer; for instance, the collagen of the tendons once laid down
turns  over  little  if  at  all.  Particularly  important  is  the  fact  that  there 
is  a  nuclear  component  that  turns  over  very  little.  After  labelling  of 
rat’s  brains  with  tritium  and  later  separation  by  centrifugation,  it 
was  found  that  about  20%  of  the  material  in  the  nuclei  retained 
its  activity  for  6  months.  This  is  about  the  proportion  of  the  nuclei 
that  is  made  up  of  DNA. 

Great  parts  of  the  living  system  are  thus  continually  changing 
and  this  gives  them  the  opportunity  to  be  adjusted  to  meet  chang¬ 
ing  external  circumstances.  This  continual  replacement  is  the  action 
system  of  the  inductive  inference  computer.  These  continual  home¬ 
ostatic  adjustments,  as  it  were,  compute  the  best  state  for  the  future 
in the light of the memory from the past and the present conditions.
It  is  this  that  ensures  against  the  risks  of  disruption  by  accident, 
which  are  inherent  in  any  highly  organised  system.  We  have  treated 
the  organism  as  a  planned  device  with  mechanisms  available  to 
meet  changes  in  its  surroundings.  These  mechanisms  include  special 
repair  processes  ready  to  meet  the  types  of  damage  to  which  the 
body  is  liable.  We  are  so  familiar  with  repair  systems  that  we  forget 
that  they  are  as  much  part  of  the  physiological  repertory  as,  say,  the 
actions  of  heart  or  lungs.  Clotting  of  the  blood  is  a  particular  case 
where  the  mechanism  is  known  in  some  detail  and  we  also  have 
direct  evidence  from  the  hereditary  disease,  haemophilia,  that  it  is 
genetically controlled. There is in fact a very large range of
regenerative powers, for example in the skin, bones, nerves or liver. We are
apt to think that mammals are limited in this respect compared
with lower animals, which can regenerate whole limbs. But it is
almost certainly a question of selection of those repair processes that
are effective in keeping the individual alive or allowing it to
reproduce. Mammals and birds do not regenerate limbs because if they
lost  one,  in  the  period  without  it  they  would  not  be  able  to  com¬ 
pete  in  the  complex  environments  that  they  occupy,  especially  since 
they  are  warm-blooded  and  need  constant  supplies  of  food.  Newts 
can float about in the water even after loss of a leg and still make
good  use  of  a  new  limb,  when  it  grows.  Frog  tadpoles  can  grow  new 
limbs,  but  not  adult  frogs,  which  would  be  too  handicapped  with¬ 
out  them. 

All these repair mechanisms are, in effect, special developments of
the turnover processes, which in turn are special features of the
growth process to which we must now turn. The feature of living
systems that produces all these adjustments is the tendency to
replicate, to grow. It would be satisfactory if we could define this in
terms  of  the  capacities  of  the  so-called  self-replicating  molecules  of 
the  nucleic  acids.  These  have,  indeed,  remarkable  properties  of  act¬ 
ing  as  primers  and  organisers  of  the  synthesis  of  others  like  them¬ 
selves.  But  they  cannot,  of  course,  do  it  simply  on  their  own.  They 
replicate  only  as  parts  of  systems  including  other  molecules,  in 
particular  those  of  the  nucleotide  polymerases.  Though  we  cannot, 
therefore,  say  that  we  can  yet  fully  define  the  chemical  conditions 
of growth we can see them much more clearly now than was
possible even a few years ago.

Given  suitable  conditions,  the  mass  of  matter  in  a  living  yeast 
cell  increases  at  a  constant  rate.  Then  after  every  four  hours  or  so 
the  rate  suddenly  doubles  and  the  cell  divides  (Mitchison  1957).  In 
a  population  of  organisms  such  behaviour  obviously  leads  to  a 
logarithmic  increase  and  this  seems  to  be  the  basic  rule  for  the  syn¬ 
thesis  of  living  matter.  What  has  grown  is  capable  of  growing.  This 
is  the  driving  force  lying  behind  all  the  adaptive  characters  of  life. 
Of  course,  in  organisms  like  ourselves  the  growth  tendency  is 
checked, so that instead of becoming mere unwieldy aggregates, the
highly  differentiated  body  is  produced,  with  all  its  organs,  each  of 
the  right  size.  But  the  growth  power  is  there  nevertheless  and  again 
becomes  manifest  in  the  repair  processes  such  as  those  we  have  men¬ 
tioned. 
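
The step from linear growth of a single cell to the “logarithmic” (that is, exponential) increase of a population can be made concrete with a short Python sketch. It is an idealisation, not a model taken from Mitchison: every cell is assumed to start with mass `m0`, to grow at a constant rate until it has doubled after exactly one cycle, and then to divide into two identical cells that repeat the process in synchrony. The four-hour cycle comes from the text; the function names and the 24-hour window are hypothetical.

```python
import math

CYCLE_HOURS = 4.0  # "every four hours or so" in the text


def lineage_mass(t_hours, m0=1.0, cycle=CYCLE_HOURS):
    """Total mass of all descendants of one cell.

    Within each cycle every cell grows linearly from m0 to 2*m0; at division
    the number of cells, and hence the total growth rate, doubles.
    """
    completed = math.floor(t_hours / cycle)      # divisions so far
    fraction = t_hours / cycle - completed       # progress through current cycle
    return m0 * 2**completed * (1.0 + fraction)


def exponential_mass(t_hours, m0=1.0, cycle=CYCLE_HOURS):
    """Smooth exponential with the same doubling time."""
    return m0 * 2.0 ** (t_hours / cycle)


times = [0.5 * i for i in range(49)]  # 0 to 24 hours in half-hour steps
ratios = [lineage_mass(t) / exponential_mass(t) for t in times]

for t in (0.0, 4.0, 8.0, 12.0):
    print(f"t = {t:4.1f} h  lineage mass = {lineage_mass(t):5.1f}")
print(f"largest ratio to the smooth exponential: {max(ratios):.4f}")
```

The piecewise-linear lineage mass coincides with the exponential at every division and stays within a few percent of it in between, which is why the population as a whole grows exponentially even though each cell grows at a constant rate.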

Indeed,  the  growth  process  never  really  stops;  it  only  reaches  the 
steady state of turnover, in which synthesis is matched by
destruction. This is well seen in the many tissues, such as the blood, whose
cells last only for a short time and are then destroyed. It is also
conspicuous in tissues that are especially liable to damage, such as those
of  the  skin  or  lining  of  the  intestine,  which  are  continuously  re¬ 
placed. 

But,  as  we  have  seen,  turnover  continues  in  the  majority  of  tis¬ 
sues,  and  turnover  is  a  form  of  growth  in  which  synthesis  and 
destruction  within  the  cells  are  balanced. 
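
A minimal Python sketch of such a balance is given below. It assumes, as an illustration only, that material is synthesised at a constant rate `synthesis_rate` and destroyed in proportion to the amount present (first-order rate `k_destroy`); the text states only that synthesis and destruction are matched. Under this assumption the pool settles at `synthesis_rate / k_destroy`, and a labelled fraction of it decays with a half-life of `ln 2 / k_destroy`. The five-day half-life and the other numbers are hypothetical.

```python
import math


def pool_mass(t, synthesis_rate, k_destroy, m0):
    """Amount of a constituent at time t for dM/dt = synthesis_rate - k_destroy * M."""
    m_steady = synthesis_rate / k_destroy
    return m_steady + (m0 - m_steady) * math.exp(-k_destroy * t)


def labelled_fraction(t, k_destroy):
    """Fraction of an initial isotope label still present after time t."""
    return math.exp(-k_destroy * t)


def half_life(k_destroy):
    """Time for the label to fall to one half."""
    return math.log(2.0) / k_destroy


# Hypothetical numbers: a five-day half-life and unit synthesis rate per day.
K = math.log(2.0) / 5.0
SYNTHESIS = 1.0
M_STEADY = SYNTHESIS / K

print(f"steady-state pool: {M_STEADY:.3f}")
print(f"pool after 30 days starting from zero: {pool_mass(30.0, SYNTHESIS, K, 0.0):.3f}")
print(f"label remaining after one half-life: {labelled_fraction(half_life(K), K):.3f}")
```

At the steady state the pool size is constant although its material is continually replaced, which is the sense in which turnover is growth balanced by destruction.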

This  continued  renewal  of  the  tissues  is  the  mechanism  that  en¬ 
sures  their  effectiveness  as  inductive  inference  computers,  provid¬ 
ing  memory  systems  for  producing  continued  survival.  Like  all 
planned systems those of the body are subject to random errors
and accidents of many sorts. Some accidents are bound ultimately
to  disrupt  the  functioning  of  any  and  every  part  and  of  the  whole 
body.  Repair  processes  are  provided  for  the  accidents  to  which  each 
part  is  prone,  but  the  repair  processes  themselves  are  liable  to 
defect.  Continuity  is  only  ensured  by  repeated  rejection  and  renewal 
of  parts  and  ultimately  of  whole  organisms. 

It  would  be  interesting  to  know  more  of  how  renewal  is  related  to 
daily  functioning  and  to  the  demands  and  stresses  imposed  by  the 
surroundings.  Many  tissues  have  remarkable  powers  of  adaptation, 
obvious  examples  being  the  growth  of  muscles  with  use  and  the 
increased  oxygen  carrying  power  of  the  blood  induced  by  life  at 
high  altitudes.  Presumably,  here,  there  are  receptor  mechanisms 
which  call  upon  specific  synthetic  response  systems.  But  for  us  the 
interesting  feature  is  that  the  muscle  or  the  blood  carry  their 
memories of the past not as coded records or changes of instructions
but simply as a change in their functioning capacity. They answer
the question “Have you been as high up as this before?” simply by
the way they act. We may find this a valuable clue when searching
for  the  memory  of  the  nervous  system. 

We  begin  to  see  then  some  sort  of  outline  of  the  “plan”,  selected 
through the millennia, by which living things remain alive. It is a
plan  depending  upon  communication  systems  to  produce  adjust¬ 
ments  at  a  series  of  time  scales  so  as  to  ensure  continuity.  Higher 
organisms  such  as  mammals  have  some  very  sensitive  and  rapidly 
adjusting  systems,  for  example,  those  that  regulate  the  heart  beat, 
respiration  and  other  processes  that  we  ordinarily  call  physiological. 
The  communication  is  here  through  the  nervous  system,  which  in¬ 
cidentally  contains  sections  operating  on  different  time  scales,  much 
slower  for  internal  ("autonomic")  than  external  ("somatic")  events. 
Rather  slower  and  longer  lasting  adjustments  are  made  by  the 
chemical  signalling  systems  of  the  endocrine  glands.  The  hormones 
are  more  suitable  than  nerve  impulses  to  control  adjustments  that 
continue over a long period and involve changes at points with
addresses  in  many  parts  of  the  body. 

Still longer term adjustments are those adaptations we have been
considering,  by  which  muscles  and  bones  grow  strong  with  use  or 
one  kidney  or  other  gland  becomes  larger  if  another  fails  and  so  on. 
These, like the simpler physiological adjustments, depend upon special
mechanisms, allowed for in the instructions.

You will notice that as we proceed to longer time scales we pass
to  events  whose  occurrence  is  less  likely.  Changes  in  heart  rate  are 
certain  to  be  demanded  of  every  organism,  as  also  are  regulation  of 
feeding  and  excretion.  Hormonal  mechanisms  are  called  into  play 
repeatedly,  but  irregularly,  as  in  the  stress  responses  of  the  adrenal 
gland,  for  example.  Adaptive  responses  of  muscle,  blood,  bones  and 
glands  are  still  less  frequent  and  vary  more  with  the  activities  of  the 
organism and the accidents imposed by the environment. The repair
mechanisms, such as those for bone, are even less regularly called
upon  and  the  body,  as  we  have  seen,  is  only  able  to  effect  repairs  that 
have  reasonably  often  proved  to  be  effective  in  the  past. 

All  of  these  mechanisms  together  ensure  survival,  but  only  for  a 
strictly  limited  period.  If  organisms  depended  only  upon  them,  life 
would  long  ago  have  become  extinct.  The  repair  mechanisms  gradu¬ 
ally  fail  and  senescence  supervenes.  In  a  state  of  nature,  accident 
usually  kills  the  organism  before  old  age  is  manifest.  Even  apart 
from  accident,  repair  mechanisms  are  likely  to  become  less  effective 
with  age,  if  only  because  of  the  decreasing  effectiveness  of  selection 
against  characteristics  that  manifest  themselves  later  (i.e.  after  some 
offspring  have  already  been  born).  We  cannot  here  enquire  into  all 
the  complexities  of  the  study  of  ageing,  but  one  feature  is  of  special 
interest,  namely  the  production  of  errors  in  the  instruction  system  of 
the  DNA.  These  are  bound  to  occur,  perhaps  especially  as  the  mole¬ 
cules  are  used  for  copying.  There  is  indeed  some  evidence  that  errors 
in  the  DNA  are  repaired.  It  is  claimed  that  they  are  recognised  by 
the  "nonsense"  molecules  that  they  produce  and  are  then  removed 
and  replaced  with  the  information  provided  by  the  homologous 
strand (Falzone et al., 1960). It is said that the turnover of DNA
increases  in  older  people  but  there  seems  some  doubt  about  this. 

For  our  purposes  the  point  is  to  emphasise  once  more  the  vulner¬ 
ability  to  accident  of  any  planned  system.  The  continuity  of  life 
demands  some  better  insurance  than  planned  repair.  And,  of  course, 
life  has  a  better  insurance.  Continuity  throughout  the  centuries  is 
maintained  by  repeatedly  submitting  the  instructions  to  the  hazards 
of  reproduction.  Whenever  a  cell  divides  there  is  a  period  in  which 
the  genes  are  not  operative.  During  that  time  only  those  metabolic 
systems  that  have  previously  been  produced  maintain  the  life  of  the 
cell. This acts as a filter for the instruction system that produced
them. If the enzymes are ineffective because of a deficit in the instructions
then that set will not be perpetuated (Falzone et al. 1966).
This  selection  is  doubly  vigorous  during  the  haploid  phase  of  sexual 
reproduction,  when  the  insurance  of  the  paired  instructions  is  lost. 
It  is  interesting  that  in  both  plants  and  animals  the  main  operating 
phases in which short-time adjustments are made use diploid and
often  polyploid  instructions.  These  provide  some  protection  against 
damage  to  the  DNA.  The  haploid  phase  is  a  short  and  sharp  period 
of  selection.  This  may  be  the  explanation  for  the  vast  production 
of  sperms,  since  in  these  the  organism  is  reduced  to  the  instructions, 
together  with  the  minimum  of  operating  machinery.  Abnormal 
sperms  are  certainly  exceedingly  common  and  the  selection  must  be 
very  severe,  so  that  the  metabolic  machinery  is  tested  to  its  limits 
and  the  DNA  that  produced  it  is  passed  on. 

Reproduction, then, is the guarantee against error in the instructions.
It is significant that no mass of living tissue continues indefinitely
without division. But reproduction does more than this; it
allows for the longest term adjustments of all, those of evolution. By
mutation, selection and recombination it ensures that new sets of
possible actions are available for selection according to the conditions
of the environment. The long term changes that we call evolution
are thus the continuation of the processes of adjustment that
begin  with  the  second  by  second  adjustments  of  the  heart  rate.  The 
mechanism  of  the  adaptive  change  is  similar  from  one  end  of  the 
time  scale  to  the  other  in  that  it  depends  on  selection  from  sets  of 
possible  responses.  On  the  evolutionary  time  scale  the  selection  is 
not accomplished by feed-back through delicate sensors and communication
channels, but by the harsh facts of survival. But such systems
of random selection among a large number of slightly different
possibilities allow some of the most precise adjustments (Platt 1938).
Our  astonishment  at  the  accuracy  and  intricacy  of  living  activities 
is  a  measure  of  our  rating  of  their  effectiveness  as  systems  of  plan¬ 
ning. 

Thus  living  systems  make  their  forecasts  on  a  great  span  of  time 
scales. We can now return to examine in more detail the process of
forecasting in the nervous system, remembering that, although it is
a particularly ingenious forecaster, it is only one among the many
in the body and that the methods that it adopts are special developments
of those that are used throughout. It has been emphasised
that all parts of the body make their forecasts by continuously
adjusting their action systems in the light of information received by
some system of sensors of surrounding conditions. This feature of
organisation as a communication channel is, of course, particularly
characteristic of the nervous system. Indeed, when we speak of
communication in the body we are apt to think of nervous communication
rather than any other sort.

An  especial  feature  of  the  nervous  communication  is  that  distant 
parts  of  the  body  are  put  into  communication  with  each  other, 
whereas the communications we have been speaking of hitherto have
been  from  one  part  of  a  cell  to  another  or  among  neighbouring 
cells.  However,  the  blood  and  its  hormones  also  provide  a  wide¬ 
spread  system. 

The  signals  carried  in  the  nervous  system  are  very  precisely  ad¬ 
dressed.  The  network  is  laid  down  under  the  influence  of  heredity, 
controlling  the  operations  of  embryonic  development.  These  provide 
nervous  channels  by  which  appropriate  actions  are  taken  to  main¬ 
tain  life.  First,  as  a  child  develops,  by  the  simple  processes  of 
respiration,  digestion  etc.,  followed  by  more  complex  actions,  sitting 
up,  walking,  talking  and  so  on.  The  effectiveness  of  these  is  ensured 
by  a  great  multiplicity  of  channels.  In  the  nervous  system  each  chan¬ 
nel  is  the  axon  or  output  fibre  of  a  nerve  cell  and  it  usually  carries 
only  information  of  one  type.  The  signals  are  the  nerve  impulses, 
all  alike  in  any  one  fibre,  varying  only  in  frequency  and  pulse  dis¬ 
tribution.  These  impulses  propagate  electrically  in  an  all-or-nothing 
manner  to  the  end  of  the  nerve  fibre.  Here  connection  is  made  either 
with  the  receiving  end  of  another  nerve  cell  or  with  a  muscle  or 
gland. Single signals do not usually pass at these gates, the synapses.
Mostly it requires a particular pattern of frequency in one fibre, or
distribution of impulses in several incoming, pre-synaptic fibres to
produce an outgoing signal in the post-synaptic cell. Moreover the
transmission from pre- to post-synaptic is usually mediated, not
electrically, but by secretion of a transmitter substance. These include
amines, such as adrenaline, nor-adrenaline and 5-hydroxytryptamine,
amino acids such as gamma aminobutyric acid and glutamine,
and  the  choline  ester  acetylcholine.  Some  of  these  act  as  exciters, 
others  as  inhibitors  of  the  post-synaptic  cell.  The  balance  of  these 
effects constitutes, as it were, the act of computing by which the
nervous system decides what actions are to be taken. Particular patterns
of input in groups of afferent (sensory) nerve fibres produce
appropriate motor outputs. When you tread upon something sharp,
the signals in the receptor fibres that are activated are so routed that
shortening  occurs  in  the  flexor  muscles  of  the  leg  that  was  pricked 
and  in  the  extensor  muscles  of  the  other  leg.  Thus  while  one  bends 
the  other  straightens— to  hold  the  body  up.  Moreover,  to  produce 
the  appropriate  action  of  lifting  the  foot  the  relevant  antagonistic 
muscles  must  be  inhibited.  The  flexors  cannot  contract  effectively 
unless  the  extensor  muscles  relax. 

If  the  nervous  system  operated  only  with  these  principles  the 
creature would move like a puppet, pulled by strings. In fact, the
nervous system is not just a passive transmitter of signals. It contains
many units that are continually active in more or less complex patterns.
Some of these produce relatively simple rhythmic actions, such
as those of breathing. Others, much more complex, maintain the
extremely elaborate patterns of activity of the higher nervous centres.
We understand relatively little of these activities at present, but
it is essential to keep in mind that most of the cells of the brain are
continually changing their thresholds, that is, their tendency to
send signals. These changes occur either because such variations are
built  into  the  composition  of  the  cells  or  under  the  influence  of 
rhythmic  excitation  from  central  systems  with  specific  activating 
functions. 

So far, the actions of the nervous systems have been described as
proceeding within an invariant network, developed under the
influence of heredity. In reality no nervous system remains static in
this way; even the simplest parts are modified by use. Probably it is
by special developments of this power to be modified that there
develops  the  memory  system  which  is  the  object  of  our  search. 

Many facts indicate that nerve cells and fibres grow if they are
used and atrophy and may die if they are left without input.
Unfortunately, very little is known of the biochemical basis for these
so-called trophic responses of nerve cells and fibres. This is one of the
lacks that leaves us in a weak position for considering the more
specialised  memory  processes  that  have  presumably  grown  out  of 
them. 

The development of a memory facility depends upon the presence
of alternative possible outputs following the stimulation of given
input channels. Of course, even in a fully reflex system the response to
activation of any given channel varies according to the other
channels with which it is stimulated. The response to treading on
thumb tacks with both feet is not the same as with one. But for
there to be an effective memory, there must be alternative possible
actions whose probability of future use can be altered in the light of
the effects for the organism that have followed from their use in
the past. This means that the memory system must be provided with
some  means  for  identifying  what  have  been  the  results  of  the  actions 
and this is, of course, one of the functions of signals such as those of
taste, pain and other systems of reward. In our work with octopuses
we have tried to work out the system by which these signals come
to alter the probability of future actions. We suppose that there are
classifying cells in the receptor system, each sensitive to some particular
environmental change and that these cells may produce alternative
actions. Thus when, for example, a horizontal rectangle appears
in  the  visual  field  a  horizontal  classifier  is  activated  and  could  make 
the  octopus  either  attack  it  or  retreat.  In  fact,  an  unfamiliar  object 
excites a slow and cautious attack. Suppose the result is a shock;
then the signals indicating that aversive behaviour is indicated
must somehow not only activate the system for retreat but also
increase the probability that retreat will follow when a similar rectangle
appears again. This, it is suggested, may be the function of
collateral channels leading from the retreat fibres and serving to
block the “attack” pathway leading from that classifying cell, turning
it into a cell ordering “retreat”. The particular mechanism that
we suppose may be operative is that these recurrent channels activate
small cells with short processes that are specialized to begin to
emit an inhibitory substance when they are appropriately activated.
Such  cells  occur  in  the  parts  of  the  brain  that  are  known  to  be 
necessary for the two separate memories of the octopus. One memory
is for objects seen and another for objects touched. Moreover, in the
touch system it has been shown that after removal of all these small
cells the animals can no longer learn not to take an object that has
been associated with pain (Wells & Wells 1957, Wells & Young HkUi).
There is, therefore, some evidence that these cells are associated
with an inhibitory function. We even have some suggestion of how
they do it. Electron microscopy shows that they are packed with the
vesicles that have been shown almost certainly to contain transmitter
substances. But these particular synaptic processes are peculiar
in being in contact with other processes also containing synaptic
vesicles. The suggestion is that when the small cells are activated,
say by signals of pain, they release a substance that blocks or inactivates
the excitatory effects of the fibres with vesicles with which they
are in contact. Such an action might be turned on suddenly by the
activation  of  an  enzyme  system,  and  would  then  block  the  unwanted 
pathway  indefinitely.  We  have,  in  fact,  evidence  that  the  octopus 
memory system shows a considerable persistence for up to four
months, which is a large part of the animal's life. There is also
evidence that, although the change once it has been made is not
reversed, if the system is not tested by use it may seem partly to
“forget”, in the sense of making at first some wrong responses when
tested  with  a  given  figure  after  a  long  time. 
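
The proposed mechanism can be put as a toy model. The sketch below is only an illustration under stated assumptions, not the circuit found in the octopus: each classifying cell has two output pathways, a painful result partly blocks the pathway that was used, and the block is never reversed. The class name `ClassifyingCell`, the parameter `block_step` and the numerical values are all hypothetical.

```python
class ClassifyingCell:
    """Toy classifier with two output pathways, 'attack' and 'retreat'."""

    def __init__(self, block_step=0.5):
        # Unblocked fraction of each pathway; 1.0 means fully open.
        self.open = {"attack": 1.0, "retreat": 1.0}
        # Fraction of the remaining pathway blocked by each experience
        # (an arbitrary illustrative value).
        self.block_step = block_step

    def probability(self, action):
        """Chance of taking `action`, in proportion to how open its pathway is."""
        total = self.open["attack"] + self.open["retreat"]
        return self.open[action] / total

    def learn(self, action_taken, outcome):
        """Block the pathway that the outcome ('pain' or 'reward') showed to be wrong."""
        if outcome == "pain":
            wrong = action_taken
        elif outcome == "reward":
            wrong = "retreat" if action_taken == "attack" else "attack"
        else:
            raise ValueError("outcome must be 'pain' or 'reward'")
        # The block only accumulates; nothing in this model reopens a pathway.
        self.open[wrong] *= 1.0 - self.block_step


# A horizontal-rectangle classifier that is shocked after three attacks.
cell = ClassifyingCell()
before = cell.probability("attack")
for _ in range(3):
    cell.learn("attack", "pain")
after = cell.probability("attack")
```

The memory in this sketch is not a stored record of the shocks. It is simply the reduced probability of attack, which is the sense in which the possible outcomes become limited by experience.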

According  to  this  view,  then,  the  change  that  constitutes  the 
memory  consists  in  alteration  of  the  probabilities  of  various  actions 
as a result of experience of their results. More specifically the possible
outcomes become reduced and limited. Situations that might
produce one of several outcomes, later, as a result of experience,
produce only one. This surely is not a bad description of the process
of learning as we experience it ourselves. The particular feature of
human memory is the capacity to react appropriately to a large
number of particular detailed situations, especially of course, those
concerned with speech. This brings us to the problem of whether
the memory contains anything similar to an elaborately addressed
record. We often have the experience that information is available
if  only  we  could  find  it.  Yet  it  is  quite  unlikely  that  the  items  are 
filed  in  a  highly  classified  system  such  as  that  of  a  computer.  Each 
point  in  a  computer  record  contains  "information"  only  in  the 
sense that when consulted at the correct moment as part of a program
its "answer" will have some relevance to events in the outside
world. Each point can be returned to a neutral state and used over
again to represent another sort of information. Nerve cells are not
like that. Each of them has a special relation to some particular type
of event in the outside world. This is, indeed, the feature that gives
to the nervous system its greatest asset. It can take data direct from
the world because it has literally a certain homeomorphism with it.
The receptor systems and brain constitute, as it were, a matrix for
modelling the world. In its untaught state the matrix is capable of
making various models, or to put it otherwise, the child can learn
to live in any of many different worlds. Different, that is, in many
respects, though the choice is more limited than we suppose. To
live a relatively effective life all people must walk, feed themselves
with their hands, speak and behave in more or less conventional
ways. We still do not know how much of our capacity to do these
things depends upon the maturation of the nervous system under
heredity and how much upon later learning. The power to learn is
certainly itself an inherited capacity. But there are considerable
limitations on what we can learn, though our powers are much
wider  than  those  of  any  animal.  We  are  not  a  perfect  instrument, 
able  to  learn  anything,  but  at  birth  our  brains  are  more  nearly 
blank  sheets  than  those  of  any  other  animal.  The  essence  of  a  good 
piece  of  writing  paper  is  that  it  does  not  have  anything  written  on 
it  already. 

A good computer can perform a wide variety of operations upon
a range of numbers. Yet it may be that even our most complex
powers, such as the use of language, depend in considerable detail
on hereditarily determined patterns of brain action. The brain is an
ingenious inductive inference computer, but one life is too short to
allow the accumulation of the immense amount of information that
is used in language. “The problem for the child is not the apparently
insuperable inductive feat of arriving at a transformational generative
grammar from restricted data, but rather that of discovering
which of possible languages he is being exposed to” (Chomsky
1967). In other words, having discovered that those around him are
speaking Spanish the child at once proceeds to use the rules of
Spanish grammar or rather the Spanish version of the universal
grammatical  rules. 

Thus, as v. Foerster puts it, the analogy of a written record is
not a happy one. The brain is not a great big reference library, in
which items lie stored even if they are never referred to. It is a
system for daily action, composed of a hierarchy of parts, though we
know little of the physiology of how they operate. The actions it
takes tend to perpetuate the individual and this tendency increases
as time goes on and actions with unfavorable results are eliminated.
This is a marvellous inductive inference computer, but unfortunately,
like all other machines, it is subject to wear. It operates well
for some three score years, with minor repairs, conducted as its
substances turn over. By that time small defects for which there is
no repair begin to accumulate and from then on we have to watch
a decline. The penalty for storing information in any one particular
piece  of  matter  is  that  it  must  deteriorate  whether  recorded  upon  a 
clay  tablet  or  in  a  brain.  Only  by  rejection  of  that  information  and 
return  to  basic  operating  instructions  is  the  essential  nature  of  the 
system  preserved.  But  man  has  found  a  way  around  this  difficulty 
too. By speech, and particularly by writing, he ensures that the
information  in  each  head  is  not  lost  at  death  but  can  be  communi¬ 
cated  to  thousands  of  others  and  reproduced  by  them.  This  is  the 
capacity  that  puts  mankind  in  a  different  category  than  all  other 
creatures,  for  he  can  store  information  additional  to  that  in  the 
gene  pool.  It  is  a  method  that  has  certainly  provided  him  with 
great  biological  advantages.  He  can  live  in  great  numbers  in  all 
parts  of  the  earth  and  possibly  eventually  in  other  parts  of  the 
universe.  It  remains  to  be  seen  whether  there  are  also  dangers  in 
this  attempt  to  preserve  information  without  the  shuffling  that  has 
been  the  safeguard  of  the  genetic  information  for  so  long.  It  may  be 
that  the  very  fact  of  not  shuffling  provides  a  dangerous  conserv¬ 
atism.  Perhaps  we  find  it  difficult  to  adjust  to  the  dangers  we  see, 
for  the  very  reason  that  we  are  so  thoroughly  well  instructed  by  the 
past. This may seem paradoxical, but two possibly useful lessons
emerge  from  this  approach  to  memory.  One  is  that  the  steady  state 
is  not  maintained  by  operating  according  to  a  fixed  set  of  rules. 
What the rules do is to provide a system with a memory or repertory
of possible actions that have proved effective in the past. It is sensitive
to its surroundings and communicates information about them
to the memory so that effective items are chosen from those possible.
The second lesson is that, though all terrestrial life as we know it
uses common materials and principles, yet detailed rules of operation
vary enormously and continually. The paradox of life is that
constancy and stability depend upon diversity and change, and that
these are ensured in the last instance only by death and renewal.

VI. Gravitation and Relativity

L. I. SCHIFF

Gravitation  is  obviously  of  great  interest  to  the  Air  Force.  One 
might even say that it is of central interest, for without it the problems
faced would be radically different from what they are. But physicists
do not study gravitation in the hope of inventing an anti-gravity
device. Nor are they motivated by the desire to make life more
comfortable for those who will occupy the Manned Orbiting Laboratory;
the use of centrifugal force as a gravity substitute was understood
long ago by Newton. Rather, scientists of all kinds, in and
out of the Air Force, have three principal objectives. First and
foremost, they attempt to improve human understanding of the
natural world. Second, they often find in the process that technology
is advanced as a by-product. Finally, they help make new knowledge
available to those directly engaged in applied science and engineering,
through their students, through discussions of all kinds, and
through  seminars  such  as  this.  The  accomplishments  in  the  last  two 
categories  more  than  justify  the  support  that  basic  science  has 
received  from  the  mission-oriented  agencies. 

SPECIAL  AND  GENERAL  RELATIVITY 

The  word  relativity  has  two  distinct  meanings  in  physics.  The 
special  theory  of  relativity  is  the  successor  theory  to  Newtonian 
mechanics, and must be used when relative velocities are comparable
with  the  velocity  of  light.  The  general  theory  of  relativity  is  the 
successor theory to Newtonian gravitation, and must be used when
gravitational fields are strong. Both theories are the work of Einstein,
the special theory having been published in 1905,¹ and the
general theory in 1916.²

In  each  case,  the  extent  to  which  the  predictions  of  the  newer 
theory differ from those of the older can be measured by a dimensionless
parameter. In special relativity, the corrections to Newtonian
mechanics are of order $(\Delta v/c)^2$, where $\Delta v$ is the difference in
velocity between the experimenter and the object that is being
observed, and $c$ is the speed of light. In general relativity, the
corrections to Newtonian gravitation are of order $\Delta\phi/c^2$, where
$\Delta\phi$ is the difference in gravitational potential between the positions
of experimenter and object, or between different positions of the
object during its motion. For example, in measurements of the
gravitational red shift in the earth's field, $\Delta\phi$ is equal to the
product of the local acceleration of gravity $g$ and the difference in
height $h$ of the points between which the red shift is measured.
Thus the fractional change in wavelength or frequency is expected
to be of order $gh/c^2$; it actually turns out to be just equal to this.
As another example, the motions of planets or light rays near the
sun, or of satellites near the earth, depend on the difference in potential
between the position of the object and infinity. This $\Delta\phi$ is equal
to $GM/r$, where $G$ is the Newtonian gravitational constant, $M$ is the
mass of the sun or the earth, as the case may be, and $r$ is the distance
from its center to the object. Thus general relativistic corrections are
expected to be of order $GM/c^2 r$.

The  magnitudes  of  these  dimensionless  parameters  in  typical 
cases point up the great difficulty in providing experimental evidence
for general, as compared with special relativity. There was a
time, thirty or forty years ago, when it was impossible to perform
experiments in which $(\Delta v/c)^2$ was comparable with unity, and it
was difficult at that time to obtain quantitative verification of the
predictions of the special theory of relativity. Those days have long
since passed, and there are now many detailed confirmations of the
theory. Indeed, none of the many existing high energy particle
accelerators would operate as designed if special relativity were not
valid, nor would the kinematics of the experiments performed with
these accelerators be intelligible. Moreover, the union of electrodynamics,
special relativity, and quantum mechanics has yielded a
number of results that are in agreement with very precise experiments,
and no result that is in disagreement. Thus there can be no
reasonable doubt at this time as to the correctness of special relativity
within its domain of validity.

While it is now an everyday matter to get $(\Delta v/c)^2 \sim 1$, the
general relativity parameter is exceedingly small, and nothing can
be done to make it larger. With $h = 10$ meters, $gh/c^2 \sim 10^{-15}$, and
at the surface of the earth, $GM/c^2 r \sim 10^{-9}$. At the surface of the
sun $GM/c^2 r \sim 10^{-6}$, and at the orbit of the planet Mercury, this
parameter has decreased to about $10^{-8}$. Thus any experiment that is
directed  toward  confirming  or  disproving  general  relativity  theory 
must  be  arranged  so  as  to  measure  very  small  effects  with  very  great 
precision.  It  is  largely  because  experiments  of  this  kind  push  beyond 
the  limits  of  existing  technology  that  research  on  general  relativity 
is  contributing  to  advances  in  applied  science  and  engineering. 
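
The orders of magnitude of these parameters can be checked with a few lines of arithmetic. The following is only an illustrative sketch: the constants, masses, and radii are rounded modern values assumed here, not figures taken from the text, and the function names are hypothetical.

```python
# Rough check of the dimensionless relativity parameters.
# All numerical constants are rounded modern values (assumptions).
G = 6.674e-11            # Newtonian gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8              # speed of light, m/s
G_ACCEL = 9.81           # gravitational acceleration at the earth's surface, m/s^2

M_EARTH, R_EARTH = 5.972e24, 6.371e6     # kg, m
M_SUN, R_SUN = 1.989e30, 6.957e8         # kg, m
R_MERCURY_ORBIT = 5.79e10                # mean orbital radius, m


def uniform_field_parameter(height):
    """gh/c^2 for a laboratory of vertical extent `height` (meters)."""
    return G_ACCEL * height / C**2


def central_mass_parameter(mass, radius):
    """GM/c^2 r at distance `radius` from the center of a mass `mass`."""
    return G * mass / (C**2 * radius)


parameters = {
    "gh/c^2 for h = 10 m": uniform_field_parameter(10.0),
    "GM/c^2r at the earth's surface": central_mass_parameter(M_EARTH, R_EARTH),
    "GM/c^2r at the sun's surface": central_mass_parameter(M_SUN, R_SUN),
    "GM/c^2r at Mercury's orbit": central_mass_parameter(M_SUN, R_MERCURY_ORBIT),
}

for label, value in parameters.items():
    print(f"{label}: {value:.1e}")
```

Every one of these numbers is many orders of magnitude below unity, which is the point of the paragraph above.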

It  has  sometimes  been  argued  that  experiments  involving  large 
accelerations,  such  as  are  produced  in  centrifuges,  can  provide  evi¬ 
dence  on  the  validity  of  general  relativity  theory.  In  actuality,  how¬ 
ever,  all  such  experiments  can  be  discussed  correctly  and  completely 
on  the  basis  of  special  relativity  theory,  since  permanent  gravita¬ 
tional  fields  produced  by  massive  objects  are  not  involved.  The  most 
cogent demonstration of this has been given by Sherwin.2 He showed
that the enormous accelerations experienced by atomic nuclei that
undergo thermal oscillations in solids, which are of order $10^{16}g$,
do  not  alter  the  agreement  between  observation  and  the  theoretical 
predictions  of  special  relativity. 

THE “CLASSICAL” TESTS OF GENERAL RELATIVITY

In  his  original  paper  on  general  relativity,  Einstein3  proposed 
three experimental verifications of the theory. These “classical”
tests, as they are now often called, consist of measurements of three
phenomena: the gravitational shift to longer wavelengths or
lower frequencies (red shift) of light in going from a stronger to a
weaker gravitational field; the deflection of starlight that passes
through the strong gravitational field close to the surface of the sun;
and the slow rotation of the major axis, or precession of the perihelion,
of the elliptical orbit of an inner planet, Mercury in particular.
Let us consider each of these in turn.

The gravitational red shift was first observed in the spectra of
the sun and two white dwarf stars, Sirius B and 40 Eridani B. However,
the only precise measurements have been performed in terrestrial
laboratories, where nuclear Mössbauer radiation has been used
over vertical distances of as much as 74 feet.4 The theoretical prediction
that the radiation should decrease in frequency by the fractional
amount $\Delta\nu/\nu = gh/c^2$ in travelling upward a distance $h$ has
now been verified with an accuracy of about one percent. This
prediction was first made by Einstein prior to his development of
general relativity,5 and is not really a test of that theory. Rather it
is  a  confirmation  of  what  is  called  the  equivalence  principle.  This 
states  that  observations  made  in  a  laboratory  that  is  at  rest  in  a 
uniform  gravitational  field  of  strength  g  are  in  complete  agreement 
with  observations  made  in  a  laboratory  when  it  is  away  from  gravi¬ 
tational  fields  and  is  subjected  to  a  constant  acceleration  g.  There  is 
very  strong  evidence  of  other  kinds  for  the  precise  validity  of  the 
equivalence  principle. 

Figure 1. Two laboratories in which identical results should be obtained,
in accordance with the equivalence principle. On the left is the actual
laboratory on the earth, and on the right is a hypothetical laboratory that
is away from gravitational fields and has the acceleration g.

The theoretical prediction for the red shift can then be derived
by considering the two laboratories illustrated in Fig. 1. The one
on the left is at rest on the surface of the earth, and that on the
right  is  away  from  the  earth  but  accelerating  upwards.  According 
to  the  equivalence  principle,  experiments  should  yield  the  same 
results  in  the  two  situations.  Thus  we  can  calculate  the  change  in 
frequency  of  the  radiation  emitted  by  the  source  S  and  detected 
at  the  receiver  R  by  using  the  arrangement  on  the  right.  Since  no 
gravitational  fields  are  involved  here,  general  relativity  need  not  be 
employed. The radiation requires a time $t = h/c$ to make the
upward trip, and during this time the accelerating laboratory
increases its upward speed by $\Delta v = gt = gh/c$. Thus by the time
R receives the radiation, it is receding with speed $\Delta v$ from S when
it emitted the radiation. It thus finds that the frequency is decreased
by the amount $\Delta\nu$ because of the Doppler effect, where
$\Delta\nu/\nu \approx \Delta v/c = gh/c^2$.
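
The three steps of this argument (transit time, gain in speed, first-order Doppler shift) can be followed numerically for the 74-foot path mentioned earlier. This is a minimal sketch; the values of g and c and the function name are assumptions introduced for illustration.

```python
# Equivalence-principle estimate of the gravitational red shift.
C = 299_792_458.0        # speed of light, m/s
G_ACCEL = 9.80665        # standard gravitational acceleration, m/s^2 (assumed)
METERS_PER_FOOT = 0.3048


def fractional_red_shift(height_m, g=G_ACCEL):
    """Fractional frequency decrease for radiation rising through height_m."""
    transit_time = height_m / C      # t = h/c
    delta_v = g * transit_time       # speed gained by the accelerating laboratory
    return delta_v / C               # first-order Doppler shift, equal to gh/c^2


shift = fractional_red_shift(74 * METERS_PER_FOOT)
print(f"Fractional red shift over 74 feet: {shift:.2e}")
```

The result, a few parts in $10^{15}$, shows why the sharp lines of Mössbauer radiation were needed for a terrestrial measurement.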

We  see  then  that  a  test  of  general  relativity  must  go  beyond  the 
gravitational  red  shift.  The  bending  of  light  rays  that  pass  through 
the  strong  gravitational  field  near  the  sun  provides  another  test  of 
the  theory.  Here  the  evidence  to  date  has  been  obtained  during 
total  eclipses,  when  stars  are  visible  fairly  close  to  the  limb  of  the 
sun. These observations are not in disagreement with the theory,
but  are  uncertain  by  roughly  20  to  30  percent,  and  so  cannot  be 
regarded  as  strong  support  for  it.  The  prediction  is  that  the  light 
ray  should  be  bent  towards  the  sun,  as  though  it  were  attracted  to 
it, and that the deflection angle should be $4GM/c^2r$ radians, where
$r$ is the distance of closest approach of the light ray to the center of
the  sun.  This  is  equal  to  1.75 (R/r)  seconds  of  arc,  where  R  is  the 
solar  radius. 
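
The figure of 1.75 seconds of arc follows directly from the expression $4GM/c^2r$ evaluated at the solar limb. The sketch below assumes modern rounded values for the sun's gravitational parameter $GM$ and radius, neither of which is given in the text.

```python
import math

# Deflection of a light ray grazing the sun, from 4GM/c^2 r.
GM_SUN = 1.32712e20      # G times the solar mass, m^3/s^2 (assumed value)
R_SUN = 6.957e8          # solar radius, m (assumed value)
C = 299_792_458.0        # speed of light, m/s
ARCSEC_PER_RADIAN = 180.0 * 3600.0 / math.pi


def deflection_arcsec(closest_approach_m):
    """Deflection angle in seconds of arc for a given distance of closest approach."""
    angle_rad = 4.0 * GM_SUN / (C**2 * closest_approach_m)
    return angle_rad * ARCSEC_PER_RADIAN


print(f"At the limb (r = R):  {deflection_arcsec(R_SUN):.2f} arcsec")
print(f"At r = 2R:            {deflection_arcsec(2.0 * R_SUN):.2f} arcsec")
```

The second line illustrates the $R/r$ falloff: a ray passing at two solar radii is bent by half as much.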

There is an interesting and instructive way of viewing the situation
described in the last paragraph. Normally one thinks of the
motions  of  light  rays  and  material  objects  as  occurring  with  respect 
to  an  inertial  coordinate  system  which  is  neither  accelerating  nor 
rotating.  In  such  a  coordinate  system,  light  rays  move  in  straight 
lines  with  constant  speed.  Now  let  us  suppose  that  the  gravitational 
field  produced  by,  say,  the  sun,  alters  this  situation  slightly  by 
imposing  a  new  direction  in  space  on  the  otherwise  isotropic  inertial 
system. The word “slightly” is important, because as the light ray
moves past the sun, the direction of the gravitational field through
which  it  is  passing  alters,  and  if  the  ray  were  to  maintain  a  constant 
angle with the field lines, it would, of course, go in a strongly
curved path around the sun. We assume as part of our qualitative
picture that the inertial coordinate system is only very weakly
coupled to the local direction of the gravitational field, so that the
light ray tends to keep the same angle with the field but fails
almost completely to do so. The very small parameter $GM/c^2r$ is
a measure of its success in following the field lines. The net result
is then that the ray moves in nearly a straight line, but is bent
toward the sun through an angle that is of order $GM/c^2r$. It should
be  emphasized  that  this  picture  merely  provides  a  way  of  thinking 
about  the  situation  once  a  careful  calculation  has  told  us  what  the 
answer  really  is.  However,  it  can  also  be  of  some  heuristic  value  in 
thinking  about  possible  new  experiments.  We  shall  return  below 
to  its  applicability  to  other  situations. 

The  third  of  the  “classical”  tests  of  general  relativity  is  the 
observation  of  the  slow  precession  of  the  perihelion,  or  point  of 
closest  approach  to  the  sun,  of  the  orbits  of  the  inner  planets. 
Newtonian theory predicts that each planetary orbit is a closed ellipse
with  the  sun  at  one  focus.  However,  perturbations  caused  by  other 
planets,  which  can  be  calculated  with  Newtonian  theory,  cause 
these  ellipses  to  rotate  slowly.  The  orbit  of  Mercury,  which  is  closest 
to  the  sun  and  hence  the  best  candidate  for  a  test  of  general  relativ¬ 
ity  theory,  is  affected  most  by  Venus  (since  it  is  closest)  and 
Jupiter  (since  it  is  most  massive).  These  perturbations  can  be  calcu¬ 
lated  quite  reliably;  when  this  is  done,  it  is  found  that  about  a 
tenth  of  the  observed  precession  cannot  be  accounted  for  in  this 
way.  General  relativity  predicts  that  there  should  be  a  precession  of 
$6\pi GM/c^2r$ radians during each orbital revolution of 88 days, where
r  is  the  mean  radius  of  the  orbit.  This  comes  to  about  43  seconds 
of  arc  per  century.  In  spite  of  its  smallness,  this  angle  has  been 
measured  with  an  accuracy  of  about  one  percent,  and  agrees  with 
the  theoretical  prediction.  We  shall  return  later  to  some  recent 
observations  which  cast  doubt  on  this  agreement. 
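
The conversion from $6\pi GM/c^2r$ radians per revolution to seconds of arc per century can be sketched as follows. The orbital elements of Mercury and the solar $GM$ are modern values assumed here. Replacing the mean radius by $a(1 - e^2)$, with $a$ the semi-major axis and $e$ the eccentricity, is the usual refinement for an elliptical orbit; it is an addition for illustration and not a step taken in the text.

```python
import math

# Relativistic perihelion precession of Mercury from 6*pi*GM/(c^2 r).
GM_SUN = 1.32712e20              # m^3/s^2 (assumed value)
C = 299_792_458.0                # m/s
ARCSEC_PER_RADIAN = 180.0 * 3600.0 / math.pi
DAYS_PER_CENTURY = 36525.0

SEMI_MAJOR_AXIS = 5.791e10       # Mercury, m (assumed value)
ECCENTRICITY = 0.2056            # Mercury (assumed value)
PERIOD_DAYS = 87.97              # Mercury (assumed value)


def precession_per_revolution(r):
    """Perihelion advance in radians per orbital revolution."""
    return 6.0 * math.pi * GM_SUN / (C**2 * r)


def precession_arcsec_per_century(r, period_days):
    revolutions = DAYS_PER_CENTURY / period_days
    return precession_per_revolution(r) * revolutions * ARCSEC_PER_RADIAN


crude = precession_arcsec_per_century(SEMI_MAJOR_AXIS, PERIOD_DAYS)
refined = precession_arcsec_per_century(
    SEMI_MAJOR_AXIS * (1.0 - ECCENTRICITY**2), PERIOD_DAYS
)
print(f"Using the mean radius:     {crude:.1f} arcsec per century")
print(f"Using a(1 - e^2) instead:  {refined:.1f} arcsec per century")
```

The mean-radius estimate already lands within a few percent of the quoted 43 seconds, and the eccentricity correction accounts for the remainder.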

The  same  qualitative  picture  that  was  used  in  discussing  the 
deflection  of  starlight  can  profitably  be  employed  here.  We  imagine 
that,  as  Mercury  goes  about  the  sun,  it  attempts  to  maintain  an 
elliptical  orbit  with  respect  to  a  coordinate  system  that  is  weakly 
coupled  to  the  changing  direction  of  the  gravitational  field  lines. 
Again, the measure of its success in adapting its orbit to this continually
changing direction is the parameter $GM/c^2r$. This tells
us then that the ellipse rotates slowly in the same direction as the
planet rotates in its orbit, and that the angle of rotation per revolution
is of order $GM/c^2r$.

THE  METRIC  TENSOR 

Any  more  quantitative  discussion  of  the  extent  to  which  the 
observations  confirm  or  disprove  general  relativity  must  bring  in  the 
idea  of  a  metric.  Einstein’s  theory  of  gravitation  is  geometrical  in 
nature,  and  relies  on  the  idea  that  space  and  time  measurements 
are  distorted  by  the  presence  of  a  massive  body.  This  distortion,  or 
space-time  curvature  as  it  is  often  called,  causes  corresponding 
changes  in  the  familiar  paths  followed  by  light  rays  or  planets.  For 
example,  we  normally  think  of  a  light  ray  as  travelling  in  a  straight 
line  in  the  sense  that  it  takes  the  shortest  path  from  one  point  to 
another.  But  in  curved  space-time,  the  shortest  path  is  generally  not 
a  straight  line,  so  that  light  rays  are  deflected  when  they  pass  close 
to  the  sun. 

The metric tensor, or simply the metric, is a quantitative measure
of this curvature. It is easily illustrated in two-dimensional space
by comparing the geometry of a plane with that of the surface of
a sphere. In a plane, two points that have x coordinates that differ
by dx, and y coordinates that differ by dy, are separated by a distance
$d\sigma$, where the expression $d\sigma^2 = dx^2 + dy^2$ is called the metric that
corresponds to a plane. However we do not have to use the
rectangular coordinates x, y; in polar coordinates, where $\rho$ is the
radial and $\phi$ the angular coordinate, the metric is
$d\sigma^2 = d\rho^2 + \rho^2 d\phi^2$.
This change of coordinate system and metric does not change
the nature of the geometry, which is still the flat geometry characteristic
of a plane. Suppose now that we replace this last expression
by $d\sigma^2 = d\rho^2 + \sin^2\rho\, d\phi^2$, which is practically the same as the
earlier expression when $\rho$ is very small. This changes the geometry
from flat to curved, since $d\sigma$ is now the distance between two points
measured on the surface of a sphere of unit radius. To see this, we
need merely note that on a sphere of radius $R$,
$d\sigma^2 = R^2 d\theta^2 + R^2 \sin^2\theta\, d\phi^2$,
where $\theta$, $\phi$ are polar coordinates. Then setting $R = 1$
and $\theta = \rho$ gives us our earlier expression. It is always possible to
tell from the form of the metric whether the geometry is flat or
curved.
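
One simple way to see how the form of the metric reveals curvature is to compare the circumference of a circle of radial coordinate $\rho$ in the two geometries. For a metric of the form $d\sigma^2 = d\rho^2 + g_{\phi\phi}\,d\phi^2$ the circumference is $2\pi\sqrt{g_{\phi\phi}}$. The sketch below is illustrative only, and its function names are hypothetical.

```python
import math

# Circumference of the circle rho = const for the plane and unit-sphere metrics.


def g_phiphi_plane(rho):
    """Coefficient of d(phi)^2 in the flat polar metric."""
    return rho**2


def g_phiphi_sphere(rho):
    """Coefficient of d(phi)^2 on a sphere of unit radius."""
    return math.sin(rho) ** 2


def circumference(g_phiphi, rho):
    """Integral of sqrt(g_phiphi) d(phi) from 0 to 2*pi at fixed rho."""
    return 2.0 * math.pi * math.sqrt(g_phiphi(rho))


for rho in (0.01, 0.5, 1.0):
    ratio = circumference(g_phiphi_sphere, rho) / circumference(g_phiphi_plane, rho)
    print(f"rho = {rho}: circumference / (2 pi rho) = {ratio:.5f}")
```

In the plane the ratio of circumference to $2\pi\rho$ is exactly one. On the sphere it is $\sin\rho/\rho$, indistinguishable from one for small $\rho$ and progressively smaller for larger circles, which is the signature of positive curvature.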
It follows then that the concept of curvature can be expressed
quantitatively by means of a metric. In flat space-time, that is, in
the absence of gravitating bodies, the metric has the form
$ds^2 = dt^2 - (1/c^2)(dx^2 + dy^2 + dz^2)$. The path of a light ray is described
by the equation $ds = 0$, which means that $d\sigma/dt = c$, where now
$d\sigma^2 = dx^2 + dy^2 + dz^2$. We see then that such a path is a straight
line and that the speed is constant and equal to $c$. In order to take
curvature into account, we must alter the coefficients of $dt^2$ and $d\sigma^2$
from unity. Since the parameter $GM/c^2r$ is always small in practical
cases, it is natural to express these coefficients as power series in this
parameter. We thus arrive at a general form for the metric,

$$ds^2 = dt^2\left[1 - 2\alpha\frac{GM}{c^2r} + 2\beta\left(\frac{GM}{c^2r}\right)^2 + \cdots\right] - \frac{1}{c^2}\left(dx^2 + dy^2 + dz^2\right)\left[1 + 2\gamma\frac{GM}{c^2r} + \cdots\right]$$

where the coefficients of $dt^2$, $dx^2$, etc. are the components of the
metric tensor.

The  two  series  have  been  written  in  this  way  in  order  to  facilitate 
comparison  with  Einstein’s  theory,  which  predicts  that  the  numbers 
$\alpha$, $\beta$, $\gamma$, ... are all equal to unity. Thus we can determine the values
of these numbers from the observations, and then express any
possible discrepancy from the general relativity prediction in terms
of deviations of these numbers from unity. We note first, however,
that we can choose the number $\alpha$ equal to unity without any loss
of generality, since any other value would simply correspond to a
different choice for the value of $G$; it is really the magnitude of
the product $\alpha G$ that determines the Newtonian elliptical orbits.

For the motion of a light ray, we have already seen that $dt$ is
equal to $d\sigma/c$ in the absence of a massive body, so that the two
quantities are approximately equal even when gravitation is present.
Thus we expect that the $\alpha$ and $\gamma$ terms both contribute to the
deflection of light in lowest order, and that the $\beta$ term will be of
higher order. A careful calculation shows that the deflection of light
is proportional to the combination $\alpha + \gamma$, or $1 + \gamma$ since we have
chosen $\alpha = 1$. The perihelion precession of a planet is a correction
to the Newtonian orbit, which is determined by the $\alpha$ term. In this
case, $dt$ and $d\sigma/c$ are not of the same order of magnitude. To see
this, we consider a circular orbit of a planet of mass $m$, for which
the centrifugal force $mv^2/r$ is equal to the gravitational attraction
$GmM/r^2$, where $r$ is the radius of the orbit. In this case $v^2 = GM/r$,
and this will be true in order of magnitude even if the orbit is not
circular. Then the ratio of the $\beta$ term to the $\gamma$ term is
$[2\beta\,dt^2 (GM/c^2r)^2]/[(2\gamma\,d\sigma^2/c^2)(GM/c^2r)] = (\beta\,dt^2/\gamma\,d\sigma^2)(GM/r)$,
which is roughly equal to $\beta/\gamma$ since $dt^2/d\sigma^2$ is equal to $1/v^2$.
Since this ratio is of order unity, the correction to the Newtonian
orbit of a planet involves the $\beta$ term as well as the $\gamma$ term. It turns
out that the perihelion precession is proportional to $2(1 + \gamma) - \beta$.

It follows that the second of the “classical” tests of general relativity
should determine the parameter $\gamma$, and the third would then
determine $\beta$. At present, however, the determination of the deflection
of light is sufficiently imprecise so that $\gamma$ is not well determined
at all; it is only the combination $2(1 + \gamma) - \beta$ that is fixed by the
planetary orbit precession.
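
The way the two orbital tests constrain the coefficients can be made concrete with a short sketch. Dividing each combination by its Einstein value, so that general relativity gives exactly one, is a normalization chosen here for illustration and is not notation used in the text; the trial values of $\beta$ and $\gamma$ are hypothetical.

```python
# Combinations of the metric coefficients that enter the two orbital tests,
# each divided by its value for beta = gamma = 1 (an illustrative normalization).


def deflection_factor(gamma):
    """Light deflection relative to the Einstein value: (1 + gamma) / 2."""
    return (1.0 + gamma) / 2.0


def precession_factor(beta, gamma):
    """Perihelion precession relative to the Einstein value: [2(1 + gamma) - beta] / 3."""
    return (2.0 * (1.0 + gamma) - beta) / 3.0


# Einstein's theory: both factors equal one.
print(deflection_factor(1.0), precession_factor(1.0, 1.0))

# A hypothetical pair of values that leaves the precession unchanged
# while reducing the deflection by five percent.
print(deflection_factor(0.9), precession_factor(0.8, 0.9))
```

The second line of output shows the degeneracy described above: a precession measurement alone cannot separate $\beta$ from $\gamma$, so an independent and accurate determination of $\gamma$ is needed.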

OBLATENESS  OF  THE  SUN 

In a recent publication,6 Dicke and Goldenberg reported their
observation that the sun is slightly oblate: the equatorial diameter
exceeds the polar diameter by the fractional amount $(5 \pm 0.7) \times 10^{-5}$.
They interpret this observation as an oblateness of the mass
distribution of the sun as well as of the distribution of visible brightness.
It should be noted that if the sun were to rotate with a constant
angular velocity equal to the average of that observed on
different parts of the surface (period of about 25 days), centrifugal
effects would produce an oblateness less than one-quarter of the
above value. They therefore assume that the oblateness is caused by
a core that rotates very rapidly, with a period of a day or two.

The equatorial bulge in the mass distribution that is caused by
the rapidly rotating core affects the orbits of the planets in an entirely
Newtonian way; the relativistic effects of the bulge are completely
negligible. The Newtonian gravitational force law deviates
slightly from the inverse square form that is characteristic of a
spherical mass distribution, and causes a precession of the perihelion
of Mercury’s orbit of 3.4 seconds of arc per century. The remainder
of the observed 43 seconds is presumably of relativistic origin, but
is evidently less than the Einstein prediction, which as noted above
is also 43 seconds. On this interpretation, the Einstein theory is
incorrect, or at best incomplete. An alternate theory, proposed
several years ago by Brans and Dicke7 as a development of an earlier
theory of Jordan, assumes that gravitation is described not only by
the metric tensor discussed above, but also by a scalar field. This
tensor-scalar  theory  contains  an  arbitrary  parameter,  which  is  es¬ 
sentially  the  ratio  of  the  magnitude  of  the  tensor  to  the  scalar  con¬ 
tribution.  This  parameter  can  be  chosen  so  as  to  reduce  the  relativis¬ 
tic  perihelion  precession  by  the  8  percent  suggested  by  the  equatorial 
bulge.  The  deflection  of  starlight  should  then  be  about  6  percent 
less  than  the  Einstein  value,  but  as  noted  above  this  is  beyond  the 
present  precision  of  the  observations. 
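
The connection between the 8 percent and 6 percent figures can be traced through the combinations $2(1 + \gamma) - \beta$ (precession) and $1 + \gamma$ (deflection) given earlier. The sketch below assumes that $\beta$ remains equal to unity in the tensor-scalar theory, so that only $\gamma$ changes; that assumption is not stated in the text and is labeled as such in the code.

```python
# From the oblateness-induced precession to the implied change in light deflection.
# ASSUMPTION (not stated in the text): beta = 1 in the tensor-scalar theory.
BETA = 1.0
OBLATENESS_PRECESSION = 3.4      # arcsec per century, attributed to the solar bulge
OBSERVED_EXCESS = 43.0           # arcsec per century, unexplained by planetary perturbations


def gamma_from_precession_factor(factor, beta=BETA):
    """Solve [2(1 + gamma) - beta] / 3 = factor for gamma."""
    return (3.0 * factor + beta) / 2.0 - 1.0


def deflection_factor(gamma):
    """Light deflection relative to the Einstein value: (1 + gamma) / 2."""
    return (1.0 + gamma) / 2.0


precession_reduction = OBLATENESS_PRECESSION / OBSERVED_EXCESS
gamma = gamma_from_precession_factor(1.0 - precession_reduction)
deflection_reduction = 1.0 - deflection_factor(gamma)

print(f"Reduction in relativistic precession: {100 * precession_reduction:.1f} percent")
print(f"Implied value of gamma:               {gamma:.3f}")
print(f"Reduction in light deflection:        {100 * deflection_reduction:.1f} percent")
```

Under this assumption the deflection is reduced by three-quarters of the fractional reduction in the precession, which reproduces the roughly 6 percent quoted above.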

The  foregoing  interpretation  of  the  oblateness  of  the  visible  sun 
in  terms  of  a  corresponding  oblateness  of  the  mass  distribution  has 
been  questioned  in  several  recent  papers.  Roxburgh8  has  suggested 
that  the  differential  Coriolis  force  associated  with  the  somewhat 
greater angular velocity of the visible surface of the sun at the
equator  as  compared  with  the  poles,  could  produce  an  apparent 
equatorial  bulge  equal  to  that  observed  without  a  significant  de¬ 
parture  of  the  mass  distribution  from  spherical  symmetry.  Goldreich 
and  Schubert9  have  argued  that  the  very  great  differential  rotation 
rate between the core and the surface of the sun, which was postulated
by Dicke and Goldenberg, is unstable and would lead to complete
mixing in a very short time. Most recently, Sturrock and Gilvarry10
have remarked that the solar magnetic field could play an
important  part  in  determining  the  visible  shape  of  the  sun,  so  that 
there  may  be  a  substantial  difference  between  the  apparent  equa¬ 
torial  bulge  and  the  mass  distribution. 

At  the  present  time  we  cannot  conclude  that  the  observed  solar 
oblateness  invalidates  general  relativity  theory.  On  the  contrary,  in 
view  of  the  arbitrariness  of  the  tensor-scalar  ratio  parameter  in  the 
Brans-Dicke  theory  and  the  difficulties  inherent  in  Dicke  and  Gol- 
denberg’s  interpretation  of  their  observations,  it  seems  most  reason¬ 
able  to  assume  for  the  present  that  the  Einstein  theory  is  correct. 
We discuss below other lines of experimentation which will in
time provide independent evidence on the validity of general relativity
theory.

OTHER  TESTS  OF  GENERAL  RELATIVITY 

The  paucity  of  experimental  evidence  on  the  validity  of  general 
relativity  has  led  to  several  suggestions  for  new  experiments,  all  of 
which are now being implemented. Two of these, proposed in 1960,
involve the generation and detection of gravitational waves,11 and
the measurement of the precession of the spin axis of an orbiting
gyroscope.12 The first of these has produced no definitive results as
yet;  the  second,  also  in  progress,  will  be  discussed  in  more  detail  in 
the  next  two  sections. 

More  recently,  it  has  been  pointed  out  that  the  transit  times  of 
radar  signals  that  pass  through  the  strong  gravitational  field  near 
the sun will be affected by general relativity.13 It is proposed that
the time interval between emission of a pulse from the earth and
reception of the reflected pulse from Venus or Mercury be measured
as the planet passes on the opposite side of the sun from the earth.
Again, no results are yet available. A modification of this experiment,
on which work has yet to be started, is intended to circumvent
the  problems  associated  with  uncertainties  in  planetary  radii  and 
topography.  It  would  consist  in  substituting  for  the  planet  as  a 
radar  target,  a  space  vehicle  that  contains  a  radar  transponder  and  is 
placed in orbit about the planet.14

PRECESSION  OF  THE  SPIN  AXIS  OF  A  GYROSCOPE 

In Newtonian theory, the spin axis of a torque-free spherical
gyroscope remains fixed in space. All of the qualifying adjectives in
the preceding sentence are important: Newtonian, torque-free, and
spherical. The gyroscope will, of course, precess if it is subject to a
torque. It will also precess if it is in an inhomogeneous gravitational
field unless it is spherical. Suppose, for example, that the gyroscope
has an equatorial bulge and is oriented with its spin axis at 45° to
the vertical in the earth's gravitational field. Because of the divergence
of the gravitational field lines of the earth and the consequent
weakening of gravitational pull with increasing altitude, there will
be a stronger downward force on the side of the gyroscope equator
that is momentarily lower than on the opposite side. This will cause a
torque to be exerted on the gyroscope in such a sense as to tend to
increase the angle between the spin axis and the vertical. Such a
gravity-gradient torque will, of course, make the spin axis precess.
However it can be shown that no matter how inhomogeneous the
earth's field may be, for example owing to the presence of mountains,
there is no torque if the gyroscope is accurately spherical.

Even for a torque-free spherical gyroscope, precession is absent
only in Newtonian theory. Both special and general relativity produce
changes in the direction of the spin axis. The special relativity
effect is called the Thomas precession, and was first discovered in
connection with atomic systems. However, since special relativity is
well established in other ways, our interest centers on the general
relativity effect, which consists of two parts. The larger part, called
the geodetic precession, is proportional to the amount by which the
gyroscope moves through the gravitational field of the earth, and is
absent if the gyroscope is stationary. This does not, however, mean
that there is no geodetic precession if the gyroscope is at rest on the
surface of the earth, since the daily rotation of the earth carries the
gyroscope with it through the earth's field. In this case the geodetic
precession amounts to about 0.4 seconds of arc per year. The effect
can  be  greatly  increased  by  placing  the  gyroscope  in  a  satellite,  since 
at  moderate  altitude  the  satellite  will  go  around  the  earth  some 
sixteen  times  a  day:  the  geodetic  precession  is  then  about  7  seconds 
of  arc  per  year. 

The  smaller  but  more  interesting  general  relativity  effect  is  called 
the  motional  precession.  It  arises  from  the  fact  that  the  earth  is  in 
daily  rotation  so  that  the  mass  of  the  earth  is  in  continual  motion. 
The  Newtonian  gravitational  field  produced  by  the  earth  is  the  same 
whether the earth is at rest or in rotation. But the Einstein theory
predicts a significant difference between the two. An instructive
analogy may be drawn between this situation and a somewhat
similar electromagnetic arrangement. Suppose that the earth is replaced
by a dielectric sphere that has positive electric charge distributed
through its volume, and the gyroscope is replaced by a
small bar magnet. If the sphere and the magnet are both at rest, no
torque is exerted on the magnet. If the sphere is at rest but the
magnet moves around it, then the electric field produced by the
sphere will appear to the moving magnet to be associated with a
small magnetic field, and this will exert a torque on the magnet.
This is analogous to the geodetic precession in the gravitational
case. Finally, if the sphere is rotating on its axis, the circulating electric
current of the moving charges will produce an external magnetic
field, and this will exert a torque on the magnet whether or not the
magnet is at rest. This is analogous to the motional precession in
the gravitational case.

General relativity predicts motional precession whether or not
the gyroscope is moving. Near the surface of the earth it is of the
order  of  a  tenth  of  a  second  of  arc  per  year.  The  reason  it  is  of  great 
interest  in  spite  of  its  smallness  is  that  it  is  the  only  manifestation 
now  known  of  the  influence  of  mass  motion  on  gravitation.  This  is 
the  reason  that  very  great  effort  is  being  directed  toward  the  highest 
possible  precision  in  measuring  the  gyroscope  precession. 

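A detailed calculation shows that the angular momentum vector
$\mathbf{S}$ of the gyroscope changes with time in the following way:15

$$\frac{d\mathbf{S}}{dt} = \boldsymbol{\Omega} \times \mathbf{S},$$

$$\boldsymbol{\Omega} = \frac{1}{2mc^2}\,(\mathbf{F} \times \mathbf{v}) + \frac{3GM}{2c^2r^3}\,(\mathbf{r} \times \mathbf{v}) + \frac{GI}{c^2r^3}\left[\frac{3\mathbf{r}\,(\boldsymbol{\omega} \cdot \mathbf{r})}{r^2} - \boldsymbol{\omega}\right].$$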

The first equation states that the change in $\mathbf{S}$ is at right angles to
$\mathbf{S}$ itself; thus $\mathbf{S}$ does not change in magnitude, but precesses about
the direction of the vector $\boldsymbol{\Omega}$ at a rate equal to the magnitude of $\boldsymbol{\Omega}$
in radians per second. The second equation is an expression for the
precession angular velocity $\boldsymbol{\Omega}$. The first term is the Thomas precession,
and depends on the nongravitational force $\mathbf{F}$ exerted on the
gyroscope, and on the gyroscope’s mass $m$ and velocity $\mathbf{v}$. Since this
term does not involve the mass $M$ of the earth or the Newtonian
gravitational constant $G$, it cannot arise from general relativity, and
as remarked above is of special relativistic origin. The second term
is the geodetic precession, and depends both on $\mathbf{v}$ and on the vector
position $\mathbf{r}$ of the gyroscope with respect to the center of the earth.
The third term is the motional precession, and involves the moment
of inertia $I$ and the angular velocity vector $\boldsymbol{\omega}$ of the earth. It
evidently arises from rotation of the earth, and is present whether
or not the gyroscope is in motion since it is independent of $\mathbf{v}$.

The  experiment  now  under  way  is  described  in  the  next  section, 
and  will  be  performed  in  a  satellite.  The  main  reason  for  choosing 
an orbiting rather than a laboratory gyroscope is that the former is
in free fall and hence does not have to be supported against gravity.
This means that the inevitable small misalignment of center of
mass and center of support of the gyroscope does not exert a torque
which could confuse the observations if it were present. A secondary
reason is of course that the geodetic precession is much larger in
orbit than on the surface of the earth; on the other hand, the
motional precession is roughly the same in the two cases. Further,
the  relatively  uninteresting  Thomas  precession  is  zero  in  orbit,  since 
F  =  0  there. 

Both  the  geodetic  and  motional  precessions  can  be  viewed  in 
terms  of  the  qualitative  picture  described  earlier  in  this  paper.  We 
imagine  that  as  the  gyroscope  goes  about  the  earth,  it  attempts  to 
maintain  a  fixed  direction  for  its  spin  axis  with  respect  to  a  coordi¬ 
nate  system  that  is  weakly  coupled  to  the  changing  direction  of  the 
gravitational  field  lines.  The  measure  of  its  success  in  adapting  its 
spin  axis  to  this  continually  changing  direction  is  the  parameter 
$GM/c^2r$. This tells us that the spin axis rotates slowly in the same
direction as the gyroscope rotates in its orbit, and that the angle of
rotation per revolution is of order $GM/c^2r$. Actually, the geodetic
(second) term in the expression for $\boldsymbol{\Omega}$ shows that for a circular orbit
of radius $r$, $\mathbf{r} \times \mathbf{v}$ integrates to $2\pi r^2$ over one revolution, so that the
geodetic precession is $3\pi GM/c^2r$ radians per revolution.
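
The result $3\pi GM/c^2r$ radians per revolution can be turned into a yearly rate for a satellite that circles the earth sixteen times a day. In the sketch below the orbital radius is obtained from Kepler's third law for a circular orbit; that step, the value of the earth's $GM$, and the function names are assumptions added for illustration.

```python
import math

# Geodetic precession of a gyroscope in a low circular orbit.
GM_EARTH = 3.986e14              # G times the earth's mass, m^3/s^2 (assumed value)
C = 299_792_458.0                # m/s
ARCSEC_PER_RADIAN = 180.0 * 3600.0 / math.pi
SECONDS_PER_DAY = 86400.0
DAYS_PER_YEAR = 365.25


def circular_orbit_radius(revolutions_per_day):
    """Orbit radius from Kepler's third law, r^3 = GM T^2 / (4 pi^2)."""
    period = SECONDS_PER_DAY / revolutions_per_day
    return (GM_EARTH * period**2 / (4.0 * math.pi**2)) ** (1.0 / 3.0)


def geodetic_precession_per_revolution(r):
    """3*pi*GM/(c^2 r), in radians per orbital revolution."""
    return 3.0 * math.pi * GM_EARTH / (C**2 * r)


revs_per_day = 16.0
radius = circular_orbit_radius(revs_per_day)
per_year = (
    geodetic_precession_per_revolution(radius)
    * revs_per_day
    * DAYS_PER_YEAR
    * ARCSEC_PER_RADIAN
)
print(f"Orbit radius:         {radius / 1000.0:.0f} km")
print(f"Geodetic precession:  {per_year:.1f} arcsec per year")
```

The result is within about ten percent of the figure of 7 seconds of arc per year quoted earlier; the exact number depends on the altitude chosen.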

The  same  picture  can  be  applied  to  the  motional  precession. 
Imagine first a gyroscope that is fixed over one of the poles. As the
earth rotates, it tends to “drag” its gravitational field with it, and
this tends to make the spin axis of the gyroscope rotate in the same
direction as $\boldsymbol{\omega}$. In this situation, $\mathbf{r}$ is parallel to $\boldsymbol{\omega}$, and the square
bracket term in $\boldsymbol{\Omega}$ is equal to $2\boldsymbol{\omega}$, in agreement with the picture.
On the other hand, if the gyroscope is fixed over the equator, $\mathbf{r}$ is
perpendicular to $\boldsymbol{\omega}$, and the square bracket term is equal to $-\boldsymbol{\omega}$.
Here the qualitative picture tells us that the earth's field is getting
weaker as it extends out from the earth, so that the dragging effect
is more pronounced on the side of the gyroscope toward the earth
than it is on the side away from the earth. Thus the gyroscope spin
axis tends to rotate in the opposite direction from the earth, in
agreement  with  the  calculation. 
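
The two limiting cases can be checked by direct evaluation. This is a sketch under one stated assumption: the square bracket term is taken to have the form 3r(ω·r)/r² − ω, which reproduces the polar and equatorial values quoted above; the function name is illustrative only.

```python
def motional_bracket(r, omega):
    """Return 3 r (omega . r) / r^2 - omega for 3-vectors given as tuples.

    Assumed form of the square bracket term in the motional precession.
    """
    r_squared = sum(x * x for x in r)
    omega_dot_r = sum(w * x for w, x in zip(omega, r))
    return tuple(3.0 * x * omega_dot_r / r_squared - w
                 for x, w in zip(r, omega))


# Earth's angular velocity along the z axis, in units of |omega|.
OMEGA = (0.0, 0.0, 1.0)

over_pole = motional_bracket((0.0, 0.0, 1.0), OMEGA)     # r parallel to omega
over_equator = motional_bracket((1.0, 0.0, 0.0), OMEGA)  # r perpendicular to omega

if __name__ == "__main__":
    print("over the pole:   ", over_pole)
    print("over the equator:", over_equator)
```

The sign change between the two positions is the quantitative counterpart of the "dragging" picture: parallel to the earth's rotation over the pole, opposite to it over the equator.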

The motional precession cannot be expressed in terms of the
parameters β and γ introduced earlier, since the metric tensor
needed to account for earth rotation is more complicated than that
used before for the nonrotating earth or sun. However, the geodetic
precession can be calculated in this way, and it turns out that the
number 3 in the numerator of the second term of Ω is replaced by
1 + 2γ. Thus measurement of the geodetic precession is somewhat
more sensitive to the value of γ than measurement of the deflection
of starlight.
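
The comparison of sensitivities can be written out explicitly. The following is a short sketch, assuming (as in the usual parametrization, and not stated explicitly above) that the deflection of starlight is proportional to (1 + γ)/2, while the geodetic precession is proportional to (1 + 2γ)/3 as just described.

```latex
\[
\frac{\Omega_{\mathrm{geod}}(\gamma)}{\Omega_{\mathrm{geod}}(1)} = \frac{1+2\gamma}{3},
\qquad
\frac{\delta_{\mathrm{light}}(\gamma)}{\delta_{\mathrm{light}}(1)} = \frac{1+\gamma}{2}
\]
\[
\left.\frac{d\ln\Omega_{\mathrm{geod}}}{d\gamma}\right|_{\gamma=1} = \frac{2}{3},
\qquad
\left.\frac{d\ln\delta_{\mathrm{light}}}{d\gamma}\right|_{\gamma=1} = \frac{1}{2}
\]
```

On these assumptions a given fractional accuracy in the geodetic precession constrains γ about 4/3 times more tightly than the same fractional accuracy in the light deflection, which is the sense of "somewhat more sensitive."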

THE GYROSCOPE EXPERIMENT

The gyroscope experiment is being performed in the Stanford
University Physics Department by Fairbank and Everitt.18 The
gyroscope itself will consist of a very homogeneous quartz sphere,
1½ inches in diameter, and coated with a thin layer of niobium.
This sphere will be supported in vacuum by electrostatic forces,
which will have to overcome only the very small differential between
earth gravity and the acceleration of the satellite. The satellite will
not be quite in free fall, because of aerodynamic drag, solar
radiation pressure, etc., but the difference should not be of greater
order of magnitude than 10⁻⁷ g. Once in orbit, the gyroscope sphere
will be spun up to speed by helium gas jets, and kept cold enough so
that the niobium coating is superconducting.

The main reason for operating at liquid helium temperature is
that a rotating superconductor possesses a magnetic moment that is
accurately aligned with its axis of rotation, and the measurement
of this moment therefore makes possible a precise determination of
the direction of the gyroscope spin axis. This effect was predicted
theoretically by London in 1950,16 and observed experimentally in
1964.17 The London moment corresponds to a magnetic field along
the spin axis of 10⁻⁷ω gauss, where ω is the angular velocity of
rotation in radians per second. It will be detected by a superconducting
loop that encircles the gyroscope, as indicated in Fig. 2. Changes in
the magnetic flux that links this detector loop will be measured by
a modulator that is part of the same superconducting circuit, as
illustrated schematically in Fig. 3.
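
The magnitude of the field to be measured can be estimated as follows. This is a sketch only: it uses the London relation in the form B = (2m/e)ω (SI units, with m and e the electron mass and charge), which gives the coefficient of about 10⁻⁷ gauss per radian per second quoted above; the spin rate of 200 revolutions per second is a hypothetical value, since the text does not give one.

```python
import math

ELECTRON_MASS = 9.109e-31      # kg
ELEMENTARY_CHARGE = 1.602e-19  # coulomb
GAUSS_PER_TESLA = 1.0e4


def london_field_gauss(omega_rad_per_s):
    """Field along the spin axis of a rotating superconductor, in gauss.

    Uses B = (2 m / e) * omega in SI units, then converts tesla to gauss.
    """
    field_tesla = 2.0 * ELECTRON_MASS / ELEMENTARY_CHARGE * omega_rad_per_s
    return field_tesla * GAUSS_PER_TESLA


if __name__ == "__main__":
    spin_rev_per_s = 200.0  # hypothetical spin rate, not taken from the text
    omega = 2.0 * math.pi * spin_rev_per_s
    print("coefficient (gauss per rad/s):", london_field_gauss(1.0))
    print("field at assumed spin rate (gauss):", london_field_gauss(omega))
```

Even at a high spin rate the field is only of the order of 10⁻⁴ gauss, which indicates why a superconducting detection circuit is needed.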

Figure 4 is a sketch of the probable configuration of the apparatus,
which contains a telescope of 4 inches aperture and four gyroscopes.
The telescope will be aligned on a star and provide the reference
direction. There will of course be corrections for aberrations that
arise from the motion of the satellite around the earth and of the
earth around the sun; these are easily calculated, and will provide
convenient checks on the performance of the telescope and London
moment read-out. A secondary reason for low temperature operation
is the extreme constancy and uniformity of the temperature of
the entire apparatus, and the consequent freedom from thermal
distortion. It is expected that angular measurements with accuracy
exceeding 0.01 seconds of arc can be made over the course of a year.

Figure 3. Schematic illustration of the superconducting detection circuit.
The loop to the left encircles the gyroscope, and the arrow indicates that
the inductance of the right-hand part of the superconducting circuit can
be varied periodically to provide a signal that is proportional to the flux
linked by the detector loop.

It is planned to place the apparatus in polar orbit, as shown in
Fig.  5.  Two  of  the  gyros  will  have  their  spin  axes  in  the  plane  of 
the  orbit  and  parallel  to  the  earth’s  rotation  axis  (gyro  1  in  the 
figure),  and  two  will  be  oriented  perpendicular  to  the  plane  of  the 
orbit  (gyro  2).  The  first  pair  will  be  sensitive  to  the  geodetic  pre¬ 
cession  alone,  and  the  second  pair  will  be  sensitive  to  the  motional 
precession  alone.  Reasonable  estimates  of  the  rate  of  loss  of  liquid 
helium  indicate  that  readings  should  be  obtainable  during  most  or 
all  of  the  first  year  after  launch. 

As  remarked  at  the  beginning,  experiments  on  general  relativity 
push  beyond  the  limits  of  existing  technology.  The  gyroscope  experi¬ 
ment  provides  an  excellent  example  of  progress  in  low  temperature 
engineering  and  magnetometry  that  was  stimulated  by  research  in 
basic  science. 

VII. Toroidal Plasma Confinement
For Fusion

Melvin B. Gottlieb

It was just about fifteen years ago that the effort to “tame the
hydrogen bomb” started in the United States, the United Kingdom
and the Soviet Union. It was, of course, well known that the vast
amounts of deuterium in the oceans of the world represented an
almost unlimited potential source of energy through the D-D
reaction:

D + D → T + p + 4 MeV

D + D → He³ + n + 3.3 MeV

The D-T reaction

D + T → He⁴ + n + 17.6 MeV

has similar potential; in this case the basic fuels become deuterium
and lithium, the latter being used to provide tritium through an
(n, T) reaction. The hydrogen bomb shows that this energy may be
released explosively, but as yet the feasibility of a controlled
thermonuclear reactor remains to be demonstrated.

At the outset it was recognized that in order to achieve a
reasonable reaction rate the materials would have to be brought to
extremely high temperatures, of the order of 10⁸ °K; that at such
high temperatures the atoms would all be ionized, thus forming a
plasma; and that the use of magnetic fields offered the possibility
of providing insulation between the hot plasma and the cold walls.

I  would  like  to  give  my  own  view  of  the  present  state  of  the  work 
on  the  containment  of  a  hot  plasma  by  a  magnetic  field. 

First, I will describe some charged-particle trajectories in
relatively simple magnetic and electric fields, and then show how these
considerations affect a few of the many field geometries that have
been proposed, in terms of the equilibrium and stability of a
plasma confined in a magnetic field.

Let us start with the simple case of a particle moving at right
angles to a constant magnetic field. It executes a circle of radius
mv⊥c/eB, m, v⊥, and e being the mass, velocity, and charge of the
particle, B the field strength, and c the velocity of light. If the initial
velocity is not perpendicular to the field, then the motion along the
field, v∥, is unaffected; the motion perpendicular is the circle as
before, the total motion being a spiral around the field lines. If
the magnetic field intensity varies slightly about the orbit, then a
drift takes place as shown in Fig. 1, a case where the magnetic field
is slightly stronger at the top than at the bottom. Note that the +
and − particles (ions and electrons) drift in opposite directions.
Suppose an electric field is applied perpendicular to a uniform B

[...] on a moving charged particle is always perpendicular to B. In the
region where the lines of force are converging, there is a small
component of this force urging the particle toward the weaker field.
This force may be large enough to prevent the particle from
escaping out the end; a reflection occurs. Of course, a particle moving
mainly parallel to B is relatively unaffected. Indeed, it turns out
that only particles in the central region, with velocity vectors lying
within a cone of generating angle θ = csc⁻¹ R^(1/2), where R (termed
the mirror ratio) is the ratio of the magnetic field magnitude in the
mirror to that at the central plane, escape through the mirrors. The
others are reflected; they are contained. These particles will be lost
only if a collision or other scattering process occurs which changes
the velocity vector to one within the escape cone.
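
The escape-cone condition lends itself to a short calculation. The sketch below takes the cone angle from sin θ = R^(−1/2), as stated above, and adds one assumption of its own: that the velocity distribution at the central plane is isotropic, so that the fraction of particles inside the two cones (one toward each mirror) is 1 − cos θ. The function names are illustrative.

```python
import math


def escape_cone_angle(mirror_ratio):
    """Generating angle of the escape cone in radians, sin(theta) = R**-0.5."""
    if mirror_ratio < 1.0:
        raise ValueError("mirror ratio must be at least 1")
    return math.asin(mirror_ratio ** -0.5)


def escaping_fraction(mirror_ratio):
    """Fraction of an isotropic velocity distribution inside both escape cones."""
    return 1.0 - math.cos(escape_cone_angle(mirror_ratio))


if __name__ == "__main__":
    for ratio in (1.5, 2.0, 5.0, 10.0):
        angle_deg = math.degrees(escape_cone_angle(ratio))
        print(ratio, angle_deg, escaping_fraction(ratio))
```

The fraction falls only slowly as R increases, so even a large mirror ratio leaves an appreciable escape cone into which collisions can scatter particles.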

Can  these  ends  be  avoided  completely  by  closing  the  lines  in  the 
form  of  a  torus?  (See  Fig.  4.)  The  field  in  a  torus  falls  off  as  1/r, 
where  r  is  the  distance  from  the  toroidal  axis.  The  field  is  non- 
uniform  and,  as  shown  previously,  ions  and  electrons  drift  in  oppo¬ 
site  directions.  An  electric  field,  transverse  to  B,  results  from  this 
charge  separation.  Particles  are  inhibited  by  the  magnetic  field  from 
flowing  in  such  a  direction  as  to  neutralize  this  charge  accumulation. 
In  this  crossed  electric  and  magnetic  field  situation  the  plasma 
drifts  rapidly  outward  in  the  manner  suggested  by  Fig.  2. 

Thus  we  say  that  a  plasma  in  a  simple  toroidal  magnetic  field 
does  not  possess  an  equilibrium.  Let  me  differentiate  this  from  the 
matter  of  stability.  To  find  out  whether  a  confined  plasma  is  stable, 
we  apply  a  small  perturbation  and  see  whether  this  perturbation 
grows  in  amplitude.  In  the  toroidal  case  just  described,  no  perturba¬ 
tion  was  applied;  still  electric  fields  were  developed  within  the 
plasma  which  would  drive  the  plasma  outward.  For  an  equilibrium 
to  exist  we  demand  that  a  steady  state  exist,  exhibiting  no  growing 
flows  or  electric  fields.  This  concept  of  equilibrium  involves  a  time 
scale. Certainly a plasma contained by a magnetic field is not in
thermodynamic equilibrium, and therefore will not remain
indefinitely in the steady state we have envisaged. This point will be
amplified later.

Are there magnetic configurations with closed lines (toroids,
topologically speaking) that do possess an equilibrium? Yes, there are
many such configurations. If, instead of forming closed circles, the
lines of force are caused to twist as shown in Fig. 5, they do not
close on themselves (except in certain cases where the rotation per
loop around the torus is a rational fraction of 2π). Each line of
force, when followed many times around, tends to form a surface,
termed a magnetic surface, and the system now consists of an
infinite set of nested surfaces. Any local charge excess tends to leak
off along the field lines, distributing itself uniformly. Such a
symmetric charge distribution causes only an electric field perpendicular
to the wall. The resulting drifts, being (as shown earlier)
perpendicular to E and B, are thus parallel to the walls, causing no losses
to the wall.

Figure 5. Twisted field lines which form magnetic “surfaces.”

It is appropriate to define another term which turns out to be
important: shear. When the lines of force in successive surfaces
twist at different rates, we say there is shear in the field.

How do we get magnetic surfaces? Fig. 6 shows a number of
methods. In each case the tube should be regarded as surrounded
by coils providing the magnetic field which goes around the torus
(the long way). Only the additional features causing the twist are
shown. The torus may be deformed (figure-8 stellarator), view (a);
helical windings may be added to the solenoidal winding (helical
stellarator), providing a transverse field component, view (b); a
current induced in the plasma will provide rotation of the field
lines (e.g., Soviet Tokamak), view (c); a current-carrying rod (e.g.,
levitron or spherator), view (d), may be used.

In  all  these  cases  except  the  first-mentioned  (figure-8  stellarator) 
the  magnetic  field  does  possess  shear. 

Before moving on to describe the experimental results there is
one additional concept which is of importance, namely, containment
time. A successful reactor obviously requires a greater power
output than power input. The plasma particles must be heated
before they will react. Each particle has, on the average, 10 keV of
kinetic energy. If it hits the wall, this represents an energy loss. A
reaction yields roughly 10³ times as much energy. To just break
even, energywise, at least one particle must engage in a nuclear
reaction for every thousand particles that escape. At temperatures of
the order of those required for a reactor, the above condition yields
the so-called Lawson criterion

nτ > 10¹⁴,

where n is the number of ions per cubic centimeter and τ the time
(in seconds) before the average ion escapes to the walls.
Considerations of power density and maximum available magnetic field
indicate that n would be in the range 10¹⁴ to 10¹⁷. Thus minimum
containment times of the order of 0.001 to 1 second are required. Now
this is a very long time when one considers the fact that the ion
thermal speeds are about 5 × 10⁷ cm/sec. This brings us to the
crux of the matter. What are the processes which result in the loss
of particles? As I stated before, the system is not in thermodynamic
equilibrium from a number of standpoints. First, the plasma is thin
compared with an absorption length for radiation. Therefore, it
does not (fortunately indeed) radiate as a blackbody. Second, we
require that the density fall off to a low value at the walls; there
is a density gradient. Third, the velocity distribution may be
significantly non-Maxwellian.
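
The arithmetic behind these containment times is simple enough to set down. The sketch below assumes the break-even product of 10¹⁴ cm⁻³ sec from the Lawson criterion above; the thermal speed passed to the second function is an assumed figure of order 5 × 10⁷ cm/sec, used only to show how far an ion would travel in a straight line during the required time.

```python
LAWSON_PRODUCT = 1.0e14  # n * tau in cm^-3 sec, break-even threshold


def min_containment_time(density_per_cm3):
    """Minimum containment time in seconds for a given ion density."""
    return LAWSON_PRODUCT / density_per_cm3


def free_path_length_cm(density_per_cm3, thermal_speed_cm_s=5.0e7):
    """Straight-line distance an ion would cover in the minimum time.

    The default thermal speed is an assumed illustrative value.
    """
    return thermal_speed_cm_s * min_containment_time(density_per_cm3)


if __name__ == "__main__":
    for density in (1.0e14, 1.0e15, 1.0e16, 1.0e17):
        print(density, min_containment_time(density),
              free_path_length_cm(density))
```

Even at the highest density considered, an unconfined ion would cover hundreds of meters in the required time, which is very large compared with the dimensions of any laboratory device.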

We are concerned here with particles escaping across the
magnetic field lines. Obviously, a single particle circling around the
field lines should remain there indefinitely (if it doesn’t radiate).
When many particles are present, collisions occur. After each
collision the particle starts moving in a different circle, displaced from
the original one. This process results in a gradual diffusion from
regions of high density to regions of low density. The containment
time calculated on the basis of this mechanism should be
proportional to B²T^(1/2) and is long compared with our requirements.
Much more serious is the matter of losses due to instabilities.
Suppose, for example, that a wavelike disturbance exists in the plasma,
causing oscillating electric fields as well as varying particle densities.
In a region where the field exists, all of the particles or a significant
fraction of them will drift together. It is apparent that low-frequency
electric fields can be particularly effective in causing large motions.

It happens that a plasma confined by a magnetic field tends to be
unstable  with  respect  to  various  wave-modes,  and  so  our  task  turns 
out  to  be  that  of  finding  means  of  suppressing  as  many  of  these 
modes  as  possible  and  of  limiting  the  amplitude  of  those  that  do 
grow.  This  is  a  complex  task,  but,  as  you  will  see,  we  are  making 
substantial  progress. 

MODEL C STELLARATOR RESULTS

The Model C stellarator was built in order to find the
containment properties of a toroidal, sheared magnetic field: a helical
stellarator. I do not propose at this time to go into the techniques of
plasma creation, impurity control, and measurement. Suffice it to
say that these are complex matters and that a great deal of ingenuity
and hard work has been involved. I will simply summarize the
results of the group led by Spitzer and Grove. Measurements have
been carried out over an extremely wide range of plasma parameters
on Model C, far more than on any other plasma device. These are
shown in Fig. 7. The results on the critical quantity, containment
time, are shown in Fig. 8. Within a factor of 2, all the points lie
along a line called “Bohm time.” The containment time is
proportional to B, not B², and to 1/T instead of T^(1/2). I will come back to
“Bohm time” shortly. Suffice it to say, at this point, that until very
recently there existed no adequate theoretical derivation of the
Bohm formula.

Figure 8. Containment time vs magnetic field in the Model C stellarator.
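
The scaling with B and T can be illustrated numerically. The Bohm formula is not written out above, so the following is a sketch under stated assumptions: the diffusion coefficient is taken in its commonly quoted form D = kT/16eB (with the electron temperature expressed in electron volts and B in tesla, this is T/16B in m²/sec), and the containment time is estimated crudely as a²/D for a plasma of radius a, ignoring geometrical factors of order unity. The radius, temperature, and field used in the example are hypothetical.

```python
def bohm_diffusion_m2_per_s(temperature_ev, field_tesla):
    """Bohm diffusion coefficient D = kT / (16 e B), assumed standard form."""
    return temperature_ev / (16.0 * field_tesla)


def bohm_time_s(radius_m, temperature_ev, field_tesla):
    """Order-of-magnitude containment time a^2 / D (geometry factors omitted)."""
    return radius_m ** 2 / bohm_diffusion_m2_per_s(temperature_ev, field_tesla)


if __name__ == "__main__":
    radius = 0.05      # hypothetical plasma radius in meters
    temperature = 10.0  # hypothetical electron temperature in eV
    for field in (1.0, 2.0, 4.0):  # hypothetical field strengths in tesla
        print(field, bohm_time_s(radius, temperature, field))
```

On this estimate the containment time doubles when B is doubled and halves when T is doubled, in contrast with the B²T^(1/2) dependence expected from collisional diffusion.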

When  one  examines  the  local  behavior  of  this  plasma,  large 
fluctuations  are  observed  in  the  density;  fluctuating  electric  fields 
are  observed.  How  do  we  arrive  at  some  understanding  of  what  is 
going  on  here?  Well,  for  a  hint,  let’s  go  back  and  look  at  some  of 
the  mirror  system  results. 

In a magnetic mirror configuration the lines of force all bulge
outward, which is equivalent, from the standpoint of Maxwell’s
equations, to the statement that the magnitude of the field falls as one
moves away from the axis. As I pointed out earlier, in a nonuniform
B field the particles drift (electrons in one direction, ions in the
opposite) perpendicular to B and ∇B. In a cylindrically symmetric
plasma, this drift about the axis produces no charge separation. If,
however, the surface is perturbed slightly (as shown schematically
in Fig. 9), the drifts do produce a charge separation. Furthermore,
the resulting electric field produces an additional motion in such a
direction as to cause the original bump to grow. Electric charges,
which may initially grow at a particular local region, tend to spread
out along a field line and the “bump” tends to take on the form of
a long ripple or flute.

Figure 9. The flute instability.

Note one important idea. If the magnetic field gradient were in
the opposite direction (if the curvature bulged inward), the ions
and electrons would drift in the opposite directions, the charge
accumulations would be interchanged (+ and −), the electric fields
reversed, and the bulge (instead of growing) would be pushed back
down; the system would be stable against a flute. Ioffe, in 1955,
reported experimental results supporting these ideas. He added to
the ordinary mirror system a set of parallel bars, each bar carrying
a current opposite in direction from its neighbors (Fig. 10). While
the pattern of the lines of force is somewhat complicated, it is clear
that at the axis the net field due to the bars is zero, and that as one
moves outward toward the bars, the field due to the bars increases.
Thus, there exists a region inside such that the magnitude of B
increases as one moves outward in any direction; i.e., there is a
minimum B region. Ioffe found a factor of 30 increase in
confinement time when sufficient current was passed through the bars to
create a minimum |B|.

Figure 10. Ioffe system for producing a field minimum in the central
region.

Since that time another instability, called a drift mode, has been
identified. It also tends to form long flutes. The description of the
particle motions is somewhat complicated, and I won’t attempt it
here. Suffice it to say that this mode is even more dangerous than
the mode I described, in that it is a low-frequency mode and harder
to stabilize. It tends to exist whenever there is a pressure gradient
(and therefore it sometimes is called a “universal” mode). Additional
driving terms are electron-ion collisions, field curvature, and plasma
current. According to theory, this mode too should be suppressed
by a minimum B field geometry.

However,  true  minimum  B  in  toroidal  form,  with  the  lines  of 
force  not  crossing  a  wall,  is  impossible,  for  in  order  to  have  a  mini¬ 
mum  B  the  lines  of  force  must  bend  outward.  This  is  incompatible 
with  their  staying  inside  a  toroidal  region. 

A  possible  solution  was  proposed  by  Furth  and  Rosenbluth.  The 
idea  may  be  described  as  follows.  Suppose  the  lines  of  force  alter¬ 
nately  curve  outward  (tending  to  be  a  stable  region)  and  inward 
(tending  to  be  unstable).  Now  an  instability  with  a  long  wavelength 
may  involve  both  regions,  and  for  such  modes  it  is  of  interest  to 
investigate  the  net  stabilizing  or  destabilizing  effect— averaged  along 
the line. The pertinent quantity turns out to be ∫dℓ/B, where dℓ
is an element of length along the line of force. If this quantity
decreases as one moves outward toward the wall, then long wavelength
drift  waves  should  be  stable.  This  concept  is  called  average  minimum 
B.  How  about  short  wavelengths?  It  turns  out  that  short  wavelengths 
will  also  be  stable  provided  the  regions  of  “good”  and  “bad”  curva¬ 
ture  are  sufficiently  close  together  (so-called  short  connection 
lengths).  This  can  be  viewed  essentially  in  the  following  terms.  A 
disturbance  which  starts  to  grow  in  the  bad  region  tends  to  spill 
plasma  over  into  the  good  region— the  rate  of  flow  being  essentially 
the  ion  thermal  velocity.  If  the  plasma  reaches  the  good  region  in  a 
time  shorter  than  the  growth  time  of  the  instability,  a  strong 
stabilizing  effect  should  take  place.  High  ion  temperatures  are 
therefore  advantageous. 

Thus  an  average  minimum  B,  together  with  short  connection 
length,  should  tend  to  stabilize  the  drift  wave. 
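
The short-connection-length argument amounts to comparing two times. As a rough sketch only, the condition can be written as L/v < 1/(growth rate), where L is the connection length between regions of bad and good curvature and v is the ion thermal speed; all numerical values below are hypothetical and serve only to show the comparison.

```python
def transit_time_s(connection_length_cm, ion_thermal_speed_cm_s):
    """Time for plasma to flow from the bad region to the good region."""
    return connection_length_cm / ion_thermal_speed_cm_s


def is_short_connection(connection_length_cm, ion_thermal_speed_cm_s,
                        growth_rate_per_s):
    """True if the transit time is shorter than the instability growth time."""
    growth_time_s = 1.0 / growth_rate_per_s
    return transit_time_s(connection_length_cm,
                          ion_thermal_speed_cm_s) < growth_time_s


if __name__ == "__main__":
    length = 100.0        # hypothetical connection length in cm
    growth_rate = 1.0e5   # hypothetical growth rate in sec^-1
    for speed in (5.0e5, 5.0e6, 5.0e7):  # hypothetical thermal speeds, cm/sec
        print(speed, is_short_connection(length, speed, growth_rate))
```

Because the transit time falls as the thermal speed rises, the same geometry that fails the test for cold ions can pass it for hot ones, which is the sense in which high ion temperatures are advantageous.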

There is another stabilization scheme which should in theory
also be effective, and that is to have strong shear in the magnetic
field.

There  is  a  considerable  amount  of  corroborative  experimental 
evidence  for  both  these  stabilization  methods.  I  will  mention  only 
a  few  here. 

The minimum-average-B, short-connection-length concept has
been applied in experiments by Kerst at Wisconsin and Ohkawa at
General Atomic. The apparatus consists of a toroidal tube
containing four circular current-carrying rods shown schematically in Fig.
11. Typical lines of force are shown in this figure. There is one set
of lines which cross at the center (at which point the field must be
zero), labelled “Separatrix” (ψ_s). Lines closer to the rods than ψ_s
circle only one rod. Lines further from the rods enclose all four rods;
for example, that labeled ψ_crit (a critical line). The region inside
ψ_crit should be stable from the ∫dℓ/B or average minimum B
standpoint. The region outside ψ_crit should be unstable. This may
be anticipated from the fact that as one moves outward the lines of
force gradually approach circles. Circles, of course, have only “bad”
curvature. Thus far experiments have been carried out only at
relatively low densities. The experiments indicate that the plasma
outside ψ_crit shows large density fluctuations, but inside ψ_crit the
fluctuations are very small, indicating that the containment time
may be very long compared with the Bohm time. Thus far, actual
confinement time measurements are limited due to the fact that
supports for the four rods extend into the plasma. The observed
plasma lifetime is in agreement with that which would be expected
due to these obstructions.

Figure 11. The toroidal octupole.

Experiments  are  now  being  planned  in  which  the  rods,  instead  of 
being  supported  mechanically,  will  be  supported  by  magnetic  fields. 
In  order  to  eliminate  current  feeds  which  provide  similar  obstruc¬ 
tions,  the  rings  will  be  superconducting.  Existing  technology  now 
seems  adequate  for  such  a  device. 

For evidence on shear stabilization let us look at the results of an
experiment by Chen. He has a long, straight solenoid with a uniform
magnetic field parallel to the axis. Down the center of the tube runs
a long rod through which a current may flow. The magnetic field
is twisted and sheared by this current. At one end of the tube is a
(constant) thermal source of potassium plasma. A cross section of
the tube is shown in Fig. 12. A probe, measuring density, is moved
along the path shown by the dotted line. One anticipates that, since
the plasma is absorbed both at the outer wall and at the rod, the
density distribution should be of the form shown. Maintaining a
constant plasma source, the shear is increased by bringing up the
current in the rod. Figure 13 shows that the density continues to
rise as the shear is increased, indicating that the losses are indeed
decreasing with shear. Noise measurements on this plasma also
show the strong stabilizing effects of shear.

Figure 12. Solenoid with shear produced by current in an axial rod.

There  is  one  other  factor  which  has  a  stabilizing  influence  and 
that  is  viscosity.  Viscosity  would  have  only  a  very  small  effect  in 
the  regime  of  interest  to  CTR;  however,  it  is  possible  to  test  the 
validity  of  the  theory  by  doing  experiments  in  which  viscosity  does 
play  a  role.  Recently  Grove  calculated  that  by  going  to  a  regime  of 
low density (so that the collisional driving terms are small), and to
high  ion  mass  (using  xenon  so  as  to  increase  the  Larmor  radius  and 
thus  increase  the  viscosity),  and  by  using  somewhat  more  shear  than 
was  heretofore  available,  he  could  theoretically  get  into  a  regime 
where  all  but  one  mode  was  stable  and  this  one  unstable  mode  was 
predicted  to  have  a  slow  growth  rate. 

At this point I need to go off on a tangent again. The relevant
point will emerge shortly. Plasma theory is an enormously
complicated field. The basic equations are nonlinear. Perturbation
techniques are used to linearize the equations in order to obtain
solutions. The descriptions of the wave modes, for example, are valid
only so long as the wave amplitudes are small. The growth rates
predicted by linearized theory cannot continue to be correct as the
amplitude increases. Obviously the energy in the wave must remain
finite; there must be a limiting amplitude to which the wave grows.
This amplitude should then correspond to a definite plasma loss
rate. There exists at this time no completely satisfactory method of
calculating the limiting amplitude. Approximate calculations
(quasi-linear) for some modes, by Galeev, give a diffusion coefficient of the
order of ν/k⊥², where ν is the growth rate predicted by linear
theory, and k⊥ is the wave number of the unstable mode. Finally
we come to the point. This theory predicts a diffusion rate that
leads to Bohm diffusion for the regime in which Bohm diffusion was
observed in the C stellarator. But to go one step further, in the
xenon low-density regime referred to earlier, this same theory
predicts that the diffusion rate that results from the unstable mode
referred to should drop by about an order of magnitude.
Surprisingly enough, the experiment performed recently did show a
diffusion rate lower by a factor of !>.
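
The form of the quasi-linear estimate can be motivated by a random-walk argument. This is a heuristic sketch only, not the calculation referred to above: it assumes that a particle is displaced by roughly one perpendicular wavelength, Δx ~ 1/k⊥, in one growth time, Δt ~ 1/ν, and that a plasma of radius a (a symbol introduced here for illustration) is lost in a time of order a²/D.

```latex
\[
D \sim \frac{(\Delta x)^2}{\Delta t}
  \sim \frac{(1/k_\perp)^2}{1/\nu}
  = \frac{\nu}{k_\perp^{2}},
\qquad
\tau \sim \frac{a^{2}}{D} \sim \frac{a^{2} k_\perp^{2}}{\nu}
\]
```

On this picture a mode with a slow growth rate or a short perpendicular wavelength produces little diffusion, which is consistent with the reduced loss rate expected in the regime where only one slowly growing mode remains unstable.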

In summary, there is beginning to be a very encouraging
correspondence between our theoretical ideas and our experimental
observations. I do not for a moment believe that our knowledge is at
all complete or even adequate. But we are making solid progress.

There are important developments in other confinement
experiments which I have not been able to discuss in this limited time.
These include the Astron experiment involving relativistic electron
layers, the programs involving high energy ion injection into mirror
geometries and the successful creation, without apparent
instabilities, of high-β (high density and temperature) plasmas in the theta
pinches. Since I personally feel that the hope for a successful future
CTR is along the path outlined by the toroidal work I have
described, this has been my emphasis.


VIII.  The  Internal  Structure  of 
Shock  Waves 

Hans  W.  Liepmann 


INTRODUCTION 

Figure  1  is  a  flow  picture  of  a  supersonically  flying  projectile;  it 
shows  clearly  a  fine  scale  structure,  the  most  typical  and  most  re¬ 
markable  characteristic  of  high  speed  flow.  The  picture  is  taken 
using  the  so-called  shadowgraph  technique  which  is  particularly 
sensitive to rapid changes in the density of the gas. The sharp
straight lines are shock waves, the grain-like structure is turbulence,
and finally, in the region between the shocks, the structure looking
like twisted spaghetti is due to random acoustic waves emitted by
the turbulence.

The  tendency  of  fluid  flow  toward  concentration  of  gradients  of 
velocity,  density,  temperature,  etc.,  along  sheets  or  lines  is  its  single 
most  apparent  and  most  interesting  feature.  The  trend  is  intimately 
connected  with  the  non-linear  character  of  the  equations  of  motion 
and  most  obvious  in  the  breaking  of  water  waves  on  beaches.  In 
any  high  speed  motion  there  thus  exists  a  race  between  the  steepen¬ 
ing  tendency  due  to  the  non-linear  inertia  effects  and  a  smoothing 
tendency  due  to  the  various  diffusive  phenomena  like  viscosity,  heat 
conductivity,  etc.  This  race  is  essential  for  turbulence,  water  waves 
and  shock  waves.  In  turbulence  it  is  the  concentration  and  diffusion 
of vorticity, in water waves the concentration and diffusion of the
surface slopes, in shock waves the concentration and diffusion of
changes in velocity, density, etc. For turbulence the non-linear aspect
of the steepening is more difficult and more interesting, because the
concentration of vorticity is an essentially three-dimensional, very
involved effect. Shock waves, on the other hand, are basically
longitudinal waves, and hence can occur and can be studied in one space
dimension. The non-linear effect in one dimension can be easily
handled, but the diffusive effects which ultimately lead to a definite
“shock structure”, i.e., a well-defined transition layer, are of primary
interest because within a strong shock the gradients can become so
large that the usual approach to momentum, heat or matter diffusion
becomes inapplicable. The transition layer, i.e., the flow within
a shock, thus becomes a test case for attempts to extend hydrodynamic
theory into the regions which are far from thermodynamic
equilibrium. Even the very simplest thermodynamic working fluid,
the perfect, monatomic gas, presents problems and is indeed the one
we will discuss here.

PRODUCTION  AND  PROPAGATION  OF  SHOCKS 

The typical way to make a shock is illustrated in the “piston
problem”: A piston slides without friction in an unlimited tube and at
time t = 0 starts to move, compressing the gas to the right. If the
ultimate piston velocity is uniform, say u, then after a while,
regardless of the way u is reached, a shock wave of unique character
propagates ahead of the piston into the undisturbed gas. This
“uniqueness” or the permanent nature of the shock means that the
“wave” is a transition layer propagating with a fixed velocity c
which separates two equilibrium states (1) and (2), with well defined
characteristics: ahead of the wave the velocity is 0, the density
ρ = ρ₁, the temperature T = T₁, etc. After the wave has passed, the
velocity is u, ρ = ρ₂, T = T₂, etc.

It is an easy task to relate the variables of state by the so-called
“jump conditions”. For example the density jump is related to u and
c by

$$\frac{\rho_2 - \rho_1}{\rho_1} = \frac{u}{c - u}$$

Two more jump conditions can be obtained from momentum
and energy conservation. If one finally adds an equation of state
which specifies the substance, for example the gas, in the tube, one
ends up with a definite problem: Knowing the fluid in the tube and
the piston velocity u, the speed c of the resulting shock wave as well
as  the  conditions  existing  once  it  has  passed  are  determined.  Note 
that  this  can  be  done  without  knowledge  of  any  of  the  transport 
parameters  of  the  fluid  by  applying  simple  thermodynamics  and 
mechanics. 
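As a rough illustration of this statement, the piston problem can be solved in closed form once the equation of state is fixed. The sketch below assumes a perfect gas with constant ratio of specific heats gamma (5/3 for a monatomic gas); the function name `piston_shock`, its arguments and the sample numbers are illustrative choices, not taken from the lecture. The three jump conditions then reduce to a quadratic for the shock speed, c² − ((γ+1)/2)·u·c − a₁² = 0, where a₁ is the sound speed ahead of the shock.

```python
import math


def piston_shock(u, a1, rho1, gamma=5.0 / 3.0):
    """Shock driven by a piston moving at speed u into a perfect gas at rest.

    a1 and rho1 are the sound speed and density ahead of the shock.
    Returns (c, rho2, p2): shock speed, density and pressure behind it.
    No viscosity or heat conductivity enters anywhere.
    """
    # Positive root of c^2 - ((gamma + 1) / 2) u c - a1^2 = 0
    b = (gamma + 1.0) * u / 4.0
    c = b + math.sqrt(b * b + a1 * a1)
    # Mass conservation: rho1 * c = rho2 * (c - u)
    rho2 = rho1 * c / (c - u)
    # Momentum conservation: p2 - p1 = rho1 * c * u, with p1 = rho1 a1^2 / gamma
    p1 = rho1 * a1 * a1 / gamma
    p2 = p1 + rho1 * c * u
    return c, rho2, p2


if __name__ == "__main__":
    # Hypothetical numbers in CGS units, chosen only for illustration.
    c, rho2, p2 = piston_shock(u=1.0e5, a1=3.0e4, rho1=1.7e-3)
    print(c, rho2, p2)
```

In the limit of a very fast piston the density ratio computed this way tends to (γ+1)/(γ−1), which is 4 for a monatomic gas.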

STEEPENING  AND  DIFFUSION  OF  VELOCITY  GRADIENTS 

In speaking about a “wave” or a “transition layer” we imply that
the changeover from the undisturbed to the disturbed state takes
place rapidly, so that the thickness of the transition zone, δ, is small
compared to the tube dimensions. In fact, what does determine δ?
Dimensional analysis and similarity require, after all, that δ has to
be a multiple of some characteristic length, ℓ, of the problem. Since
it does not depend on the tube diameter or any length parameter to
be made up from the starting characteristics of the piston, it can
only depend upon an intrinsic length parameter of the gas itself.
A perfect gas is thermodynamically characterized by its molecular
mass, m, only; the molecular “size”, i.e., the collision cross sections, can
be neglected. Irreversible transport phenomena like viscosity do
depend, of course, on the molecular “size”, thus depending on the
collision cross sections or mean free path of molecular travel between
collisions, Λ. Hence it is obvious that the shock thickness depends
on Λ or on the transport coefficients, like viscosity, related to Λ.

Consequently,  a  shock  wave  of  a  given  “strength”,  measured  by 
its propagation velocity c, the pressure ratio p₂/p₁ or in any other
suitable  way,  will  possess  a  structure  determined  by  the  gas  and  will 
be  independent  of  the  way  the  shock  was  produced. 

Hence the ultimate balance between inertial steepening and
diffusive smoothing is independent of the macroscopic setup of the
problem. Indeed the time, t₀, it takes for a velocity profile, which at
t = 0 possesses a maximum gradient s = |∂u/∂x|, to develop a jump, i.e.,
to make s → ∞, is simply inversely proportional to s, or

$$t_0 \sim \frac{1}{s}$$

The time t₁ to smooth, by diffusive action alone, a jump into a profile
with a width δ is

$$t_1 = \frac{\delta^2}{\nu}$$

where ν = μ/ρ is the ratio of viscosity to density, the so-called
kinematic viscosity. Noting that

$$s \approx \frac{\Delta u}{\delta}$$

the two effects balance if (Fig. 2)

$$\delta \approx \frac{\nu}{\Delta u}$$


Figure 2. The balance of inertia and viscous effects in the shock formation
process. (Panels: nonlinear, zero viscosity; linear, finite viscosity.)

i.e., an estimate of the thickness of the transition zone is easily
obtained. For example, for a strong shock propagating into standard
air we find

$$\nu \approx 0.16 \ \text{(CGS)}, \qquad \Delta u = u \sim c$$

Hence if c ~ 10a, with a the velocity of sound in air, 3 × 10⁴ cm/sec,

$$\delta \approx \frac{0.16}{3 \times 10^{5}} \approx 5 \times 10^{-7} \ \text{cm}$$

This extremely crude result serves mainly to emphasize the dimensions
involved. Actually, the computed thickness δ is less than the
mean free path Λ and hence our derivation using ν is not
self-consistent. Indeed, the reasoning based on ν is adequate, correct in
fact, for weak shocks for which c → a and Δu → 0, but for strong
shocks δ will remain equal to a few mean free paths.
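The arithmetic of this estimate can be repeated in a few lines. The values of ν, a and c are those quoted in the text; the mean free path of standard air, about 6.6 × 10⁻⁶ cm, is an approximate textbook figure added here as an assumption, only to show the inconsistency the text points out.

```python
# Order-of-magnitude estimate of shock thickness, delta ~ nu / delta_u (CGS units).
nu = 0.16                  # kinematic viscosity of standard air, cm^2/sec (from the text)
a = 3.0e4                  # velocity of sound in air, cm/sec (from the text)
c = 10.0 * a               # strong shock, c ~ 10 a
delta_u = c                # velocity jump, delta_u = u ~ c

delta = nu / delta_u       # estimated thickness, cm

# Assumed value, not from the text: approximate mean free path of standard air.
mean_free_path = 6.6e-6    # cm

print(delta, delta / mean_free_path)
```

The ratio printed last is well below one, so a continuum estimate built on ν predicts a layer thinner than the distance a molecule travels between collisions, which is the sign that the estimate has been pushed beyond its range.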

One may add here that one has not found a similar, unique and
typical element of turbulence, that is, an experimentally realizable
“vortex”, which demonstrates the non-linear-viscous race in turbulent
motion.

ENTROPY 

The  shock  compression  and  the  resulting  heating  of  the  gas  is  a 
rapid  process  in  which  the  fluid  passes  through  a  state  of  non¬ 
equilibrium  within  the  transition  layer.  A  change  of  state,  in  which 
the  material  is  always  close  to  equilibrium,  is  reversible  in  the 
thermodynamic  sense.  If  during  the  change  of  state,  non-equilibri¬ 
um  conditions  occur  in  the  system,  the  process  is  irreversible;  the 
entropy  of  the  system  increases.  The  specific  entropy  of  the  gas  s,  the 
entropy  per  unit  mass  is  consequently  larger  in  the  gas  after  the 
shock  has  passed,  i.e., 

$$s_2 - s_1 > 0$$

and  the  difference  increases  with  the  strength  of  the  shock.  Since 
entropy is a variable of state, s₂ − s₁ can be computed without
reference  to  the  processes  going  on  in  the  transition  layer.  The 
“production”  of  entropy  occurs,  of  course,  in  the  transition  zone 
which  adjusts  its  thickness  and  hence  the  slopes  of  the  velocity  and 
temperature  profiles,  to  produce  the  necessary  increase  in  entropy. 
Within the range of applicability of the classical equations of
hydrodynamics, or for that matter within the regime of so-called
irreversible thermodynamics, the entropy production is proportional
to the square of the velocity and temperature gradients,

$$\mu \left(\frac{du}{dx}\right)^{2} \quad \text{and} \quad k \left(\frac{dT}{dx}\right)^{2},$$

where μ and k are viscosity and heat conductivity respectively.

Thus the stronger the wave, the thinner the shock, the larger the
gradients and the larger the entropy change. However, the thickness
change must be such that the total entropy increase becomes
independent of the transport parameters μ and k. It is easily seen
that this is true if δ scales linearly with μ and k, a result which
followed already from the balance of diffusion and steepening.
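The scaling argument can be sketched for the viscous term alone, as an order-of-magnitude estimate that drops numerical constants and the temperature factors. The velocity gradient inside the layer is taken as roughly Δu/δ over a width δ, and δ is taken from the balance of steepening and diffusion.

```latex
\int \mu \left(\frac{du}{dx}\right)^{2} dx
  \;\sim\; \mu \left(\frac{\Delta u}{\delta}\right)^{2} \delta
  \;=\; \frac{\mu \, (\Delta u)^{2}}{\delta},
\qquad
\delta \sim \frac{\mu}{\rho \, \Delta u}
\;\Longrightarrow\;
\frac{\mu \, (\Delta u)^{2}}{\delta} \;\sim\; \rho \, (\Delta u)^{3}.
```

The viscosity cancels, so the integrated production depends only on the strength of the shock. The same reasoning applies to the heat conduction term if δ is also proportional to k.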

Beyond  the  range  of  applicability  of  concepts  like  viscosity  and 
heat conductivity any complete theory must show these general
features:  An  entropy  production  due  to  stress  and  heat  flux  within 
the  transition  zone  which  depends  on  molecular  interactions  involv¬ 
ing  the  range  of  the  intermolecular  force  fields,  i.e.,  for  a  gas,  the 
collision  cross  section.  The  form  of  the  production  terms  must  be 
such  that  the  total  entropy  production  becomes  independent  of  the 
details  of  the  molecular  processes. 

DISTRIBUTION  FUNCTION 

We restrict the discussion now to a simple monatomic gas like
argon. For such a gas, one can define a distribution function f such
that f(v, x, t) denotes the number of molecules which are found at a
given position in space at a given time and with a specified velocity
within certain tolerances. f has the character of a probability density
or statistical weight; macroscopically measurable quantities are
the averages formed using f. For example the number of particles at
x and t is simply the sum or integral of f over all velocities v:

$$n = \int f \, dv$$

Similarly the mean velocity u at x and t is given by

$$n u = \int v \, f \, dv$$

The difference between the velocity v and the mean velocity u is
usually called C:

$$v - u = C$$

C  is  the  random  component  of  molecular  velocities  as  seen  by  an 
observer  traveling  along  with  the  mean  velocity  u. 
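These averages are easy to evaluate numerically once f is known on a grid of velocities. The sketch below is a one-dimensional illustration only: it assumes a Gaussian (Maxwellian) f with made-up parameters and uses a simple midpoint rule. The names `maxwellian_1d` and `moments` are hypothetical helpers, not part of any library.

```python
import math


def maxwellian_1d(v, n, u, sigma):
    """One-dimensional Maxwellian (Gaussian) with density n, mean u, spread sigma."""
    return n / (math.sqrt(2.0 * math.pi) * sigma) * math.exp(-0.5 * ((v - u) / sigma) ** 2)


def moments(f, v_min, v_max, steps=4000):
    """Return (n, u, mean of C^2) for f(v), by midpoint integration over velocity."""
    dv = (v_max - v_min) / steps
    vs = [v_min + (i + 0.5) * dv for i in range(steps)]
    fs = [f(v) for v in vs]
    n = sum(fs) * dv                                    # n = integral of f dv
    u = sum(v * fv for v, fv in zip(vs, fs)) * dv / n   # n u = integral of v f dv
    # C = v - u is the random velocity; its mean square measures the temperature.
    c2 = sum((v - u) ** 2 * fv for v, fv in zip(vs, fs)) * dv / n
    return n, u, c2


# Illustrative parameters (assumed): density 2.0, mean velocity 1.5, spread 0.5.
n, u, c2 = moments(lambda v: maxwellian_1d(v, n=2.0, u=1.5, sigma=0.5), -10.0, 10.0)
print(n, u, c2)
```

The integration recovers the density and mean velocity that were put in, and the mean square of C comes out equal to the square of the spread.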

Knowing  f,  any  macroscopically  measurable  variable  of  the  gas 
can  be  computed.  Indeed,  f  includes  much  more  information  than 
we  usually  need  and  it  is  therefore  not  too  surprising  that  the 
determination  of  f  is  very  difficult.  The  equation  from  which  f  can 
be  obtained  in  principle  is  the  famous  Boltzmann  equation:  The 
distribution  function  is  altered  by  collisions  between  molecules. 
Each collision process can be computed in a straightforward, but
not always simple, way from mechanics if the initial conditions of the
two colliding molecules are known.

These  initial  conditions  are  considered  random  parameters;  in 
this  way  a  statistical  element  was  introduced  by  Boltzmann  into 
the  otherwise  completely  deterministic  theory.  The  equation  for 
the distribution function f can then be written in an innocent
looking way

$$\frac{Df}{Dt} = G - fL$$

stating that the rate of change of f is due to a Gain, G, minus a Loss,
fL, i.e., molecules in the appropriate velocity range are gained and
lost by collisions; f increases or decreases as the net result of these
processes. Unfortunately G and L are complicated integrals involving
f in a non-linear way.

In equilibrium Df/Dt = 0 and it is easy to show that the distribution
function, which is now independent of x and t, reduces to the
well-known Maxwellian or Gaussian distribution, F(v).

If the gas is near equilibrium so that f does not differ much (in
some sense!) from F, one can use the so-called Chapman-Enskog
theory which shows that the classical hydrodynamical equations
are consistent with the first approximations to the Boltzmann
equation, i.e., near equilibrium f − F depends linearly on the
gradients of velocity and temperature.

STRESS  AND  HEAT  FLUX 

Momentum and energy flow can be written in terms of the molecular
velocities C; the momentum carried by each molecule is simply
mC, the volume flow nC, and the momentum flow through unit area in
unit time is

$$n m \, \overline{\mathbf{C}\mathbf{C}} = \mathbf{P}$$

Similarly the flow of energy is equal to the mean product of (m/2)C²,
the energy of a molecule, times the volume flow:

$$n \, \frac{m}{2} \, \overline{\mathbf{C} \, C^{2}} = \mathbf{q}$$

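In equilibrium, conditions in the gas are uniform, i.e., no direction
and no position is distinguished. In this case averages like $\overline{C_i}$ as
well as $\overline{C_i C_j}$ (i ≠ j) and $\overline{C_i C^2}$ are zero. Hence q = 0 and P now contains
only the three terms

$$n m \overline{C_1^2}, \quad n m \overline{C_2^2}, \quad n m \overline{C_3^2}$$

which, moreover, must be equal; the thermodynamic pressure p
is thus defined by

$$p = \tfrac{1}{3} m n \left( \overline{C_1^2} + \overline{C_2^2} + \overline{C_3^2} \right)$$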

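Now consider non-equilibrium flow. To make matters simple, specialize
immediately to a shock layer. Heat flux and viscous stress
then have only one component each, q and τ.

The stress component P₁₁ is made up of thermodynamic pressure
p and viscous stress τ:

$$P_{11} = p + \tau = m n \overline{C_1^2}$$

hence

$$\tau = m n \left( \overline{C_1^2} - \tfrac{1}{3} \overline{C^2} \right) = m n \left[ \overline{C_1^2} - \tfrac{1}{3} \left( \overline{C_1^2} + \overline{C_2^2} + \overline{C_3^2} \right) \right]$$

By symmetry, $\overline{C_2^2} = \overline{C_3^2}$, and τ and p can be written

$$\tau = \tfrac{2}{3} m n \left( \overline{C_1^2} - \overline{C_2^2} \right)$$

$$p = \tfrac{1}{3} m n \left( \overline{C_1^2} + 2 \overline{C_2^2} \right)$$

Of particular interest is the ratio τ/p,

$$\frac{\tau}{p} = 2 \, \frac{\overline{C_1^2} - \overline{C_2^2}}{\overline{C_1^2} + 2 \overline{C_2^2}}$$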

which is a convenient non-dimensional measure for the departure
from equilibrium. Indeed, τ/p is essentially the expansion parameter
in the Chapman-Enskog theory and hence we know that if
τ/p is small compared to unity

$$\tau = -\tfrac{4}{3} \mu \frac{du}{dx}, \qquad q = -k \frac{dT}{dx}$$

Hence, as expected for small deviations from equilibrium, we obtain
the usual Navier-Stokes equations of fluid dynamics. The transport
parameters μ and k are functions of T which, in principle, can
be determined for any intermolecular force law.

It is not difficult to show that for strong shocks, τ/p increases
rapidly and exceeds unity, at least in parts of the shock layer. Hence
the Chapman-Enskog series cannot be expected to converge. For
strong shocks one must therefore find f(x, v) and determine from it
the measurable variables of state. Solving the Boltzmann equation
exactly is out of the question; hence the attempts to find f must be
directed toward approximations of a different type than the
Chapman-Enskog approach.

For the shock layer problem, two types of approximations can
be used. The first is the Mott-Smith method, which approximates the
distribution function by a weighted sum of the Maxwellians before and
after the shock:

$$f = \alpha F_1 + (1 - \alpha) F_2$$

with α(x) a weighting function. α(x) can be determined in such a
way that f is an approximate solution to the exact equation. The
other method consists of finding exact solutions to model equations,
i.e., equations which are simpler than the Boltzmann equation but,
hopefully, include the essential features. The best known of these
model equations is the so-called BGK or Krook model

$$\frac{Df}{Dt} = A n (F - f)$$

in which F is the local Maxwellian. F depends therefore on u, n and
T, which in turn are moments of the unknown f. The equation is
consequently still non-linear but very much simpler than the
Boltzmann equation. There is little doubt that the Boltzmann equation
contains much more information than is required for a description
of an experimentally realizable situation. It is therefore quite
reasonable to study simpler model equations like the one by
Bhatnagar-Gross-Krook used here.
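The character of the model equation can be seen in a much simpler setting than the shock layer. The sketch below assumes a spatially uniform gas with a single velocity component, so that the equation reduces to df/dt = A n (F − f); it starts from a mixture of two Maxwellians of the Mott-Smith form and lets it relax. The constant A, the time step, the grid and the starting parameters are arbitrary illustrative choices, and explicit Euler stepping is used only for brevity. This is not the shock computation described in the text.

```python
import math


def gaussian(v, n, u, sigma):
    """One-dimensional Maxwellian with density n, mean velocity u, spread sigma."""
    return n / (math.sqrt(2.0 * math.pi) * sigma) * math.exp(-0.5 * ((v - u) / sigma) ** 2)


def local_maxwellian(vs, fs, dv):
    """Maxwellian F with the same density, mean velocity and temperature as f."""
    n = sum(fs) * dv
    u = sum(v * f for v, f in zip(vs, fs)) * dv / n
    var = sum((v - u) ** 2 * f for v, f in zip(vs, fs)) * dv / n
    return [gaussian(v, n, u, math.sqrt(var)) for v in vs], n


def bgk_relax(fs, vs, dv, A, dt, steps):
    """Explicit Euler steps of df/dt = A n (F - f) for a uniform gas."""
    for _ in range(steps):
        F, n = local_maxwellian(vs, fs, dv)
        fs = [f + dt * A * n * (Fv - f) for f, Fv in zip(fs, F)]
    return fs


def l1_distance(f, g, dv):
    """Integral of |f - g| over the velocity grid."""
    return sum(abs(a - b) for a, b in zip(f, g)) * dv


dv = 0.05
vs = [-12.0 + (i + 0.5) * dv for i in range(480)]
alpha = 0.5  # weight of the first Maxwellian in the starting mixture
f0 = [alpha * gaussian(v, 1.0, 3.0, 0.5) + (1.0 - alpha) * gaussian(v, 1.0, 0.0, 1.5)
      for v in vs]
F0, n0 = local_maxwellian(vs, f0, dv)
f1 = bgk_relax(f0, vs, dv, A=1.0, dt=0.01, steps=300)

# Fraction of the initial departure from the Maxwellian that is left at t = 3.
ratio = l1_distance(f1, F0, dv) / l1_distance(f0, F0, dv)
print(ratio)
```

Because collisions conserve n, u and T in the uniform case, F stays fixed and the departure f − F decays at the rate A n. The non-linearity mentioned in the text enters only through the dependence of F on the moments of f.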

The model agrees closely with the Navier-Stokes equations if
τ/p is small and is compatible with free molecular, i.e.,
collision-free, flow in the opposite limit. It, furthermore, demonstrates at least
some of the interesting mathematical properties of the Boltzmann
equation in manageable form; in particular the non-uniform
convergence of the Chapman-Enskog series is obvious. For the shock
wave problem, the model equations can be numerically integrated
and all the flow parameters determined. Typical results showing
the development of the distribution function through a shock layer
are shown in Fig. 3.

EXPERIMENTS 

Figure 3. Velocity distribution function at various points within a shock
wave.

The measurement of the density distribution within a shock
layer requires considerable experimental skill and for strong shocks
has been successfully carried through only in recent years. The most
accurate and complete data have been taken with the help of the
absorption of an electron beam recently by Schmidt at GALCIT;
the principle of the method and sample results are shown in Figs.
4, 5 and 6. The density distribution is compared with BGK computation
as well as with the corresponding solution of the Navier-Stokes
equations. The latter becomes clearly less and less adequate
as the shock strength increases; the overall behavior, e.g., the
“thickness”, for strong shocks is well represented by the BGK
computations. Details in the distribution are, however, different. These
differences are significant in pinpointing the various physical
phenomena which combine to produce the shock profile. For example,
the long upstream precursor of the BGK distribution is qualitatively
certainly real and represents the diffusion of “hot”, that is, fast
molecules. It is quantitatively exaggerated because of the molecular
model which corresponds to a very “soft” potential.

Figure 4. Electron-beam densitometer applied to shock structure
measurements in a shock tube.

The use of sophisticated shock tubes and the study of model
equations in these recent studies of shock structure has gone a long
way in clarifying rarefied gas flows, in particular in the transition
zone where neither the Navier-Stokes nor the collisionless approximation
holds. In particular, it has become clear why and how the
difficulties with the Chapman-Enskog procedures appear if the
method is applied far from equilibrium. The interplay of local
molecular interactions, which typically lead to transport proportional
to the gradients of the observables, and of global interactions
of fast molecules with long free paths can be studied very successfully
using the shock structure as a mathematical and experimental
model. It is in this sense that the use of model equations has
advantages over, say, the Mott-Smith approximation, which can give
the shock thickness and the shock profile as well as, or better, and
with less labor.


IX.  Mappings  as  a  Basic 
Mathematical  Concept 

Saunders  Mac  Lane 

Mathematical  research  is  sometimes  devoted  to  attacking  and 
solving  explicit  hard  problems;  such  problems  may  arise  within 
mathematics  itself  or  in  one  of  its  many  applications.  At  other  times, 
mathematical research is concerned with the disclosure and development
of new general concepts, especially those which are “abstracted”
from particular mathematical situations. In some cases,
these  abstractions  will  themselves  later  lead  to  explicit  hard  prob¬ 
lems;  in  all  cases,  they  should  lead  to  a  clarification  and  better 
understanding,  because  such  ideas,  though  abstract,  can  be  simple. 

This  talk  will  deal  with  an  example  of  this  second  type  of  general 
development.  It  is  thus  a  report  on  a  sample  of  one  sort  of  basic 
research.  The  particular  example  to  be  discussed  is  one  which  we 
hope  will  be  useful  in  organizing  and  codifying  mathematical 
knowledge.  This  is  a  task  which  is  pressing  today,  in  view  of  the 
rapid  rate  at  which  knowledge,  mathematical  and  otherwise,  is 
growing. 

The abstract idea to be examined is that of a “diagram of mappings”.
This might also be called a “block diagram” or “arrow
diagram”. Samples of such diagrams are:

[Figure: sample arrow diagrams.]

There are many others, including much bigger ones (even with an
infinite number of different arrows). We call such an arrow diagram
a category and explain this notion as follows. A category consists of
vertices P, Q, R, ... and arrows f, g, ... Each arrow starts from a
vertex P and ends at some vertex Q (which may happen to be the
same vertex, Q = P). We write f: P → Q to indicate that f starts
at P (or, has “domain” P) and ends at Q (or, has “codomain” Q).
Most important, the compound path formed by following two successive
arrows f and g is always represented by another single arrow
c, called the composite c of g with f, and written as c = g∘f, as in
the diagram

[Diagram: arrows f: P → Q and g: Q → R, with the composite c = g∘f: P → R.]


This requirement can be stated more formally as follows: if the
arrow g starts where the arrow f ends, then the composite arrow
c = g∘f is defined; it starts where f does and ends where g does.

Given three arrows f, g, and h in succession, we can then form a
triple composite, as in the following figure.


This successive composition can be done in two different ways:
first compose g with f, and then h with the result, or first h with g,
and then the result with f, as indicated. We require as an axiom
that the results of these two different triple composites be equal or,
in symbols:

(1) h∘(g∘f) = (h∘g)∘f.

This axiom is called the associative law for composition, because it
is like the associative law x(yz) = (xy)z for the multiplication of
numbers x, y, and z.

Another law for the multiplication of numbers is the property of
the number 1, which states that 1y = y = y1 for any number y; we say
that the number 1 is an “identity” for multiplication. Similarly, in
an arrow diagram, we assume that for each vertex P there is an
identity arrow 1_P which starts and ends at P. In the diagram

[Diagram: an arrow f: P → Q, with identity arrows 1_P at P and 1_Q at Q.]

we have indicated two such identity arrows, those at the vertices
P and Q. This diagram also suggests the composites which might be
formed with such identities; our second axiom now requires that
these composites satisfy

(2) 1_Q∘f = f = f∘1_P

for any arrow f: P → Q. With the axioms (1) and (2) the idea of a
category (arrow diagram) is completely defined. Now for a few of
the many examples.

First, a function may be regarded as an arrow. The usual graph
y = f(x) of a function may be considered as a rule which gives to
each point x₁ on an x-axis a corresponding point y₁ = f(x₁) on the
y-axis:

[Figure: graph of y = f(x).]

Thus f gives arrows x₁ → f(x₁), x₂ → f(x₂), ..., and so “maps” the
set X of all points x to the set Y. In general, if X is any set (i.e., any
collection of elements) and Y another set, a function f with domain
X and range Y (a function on X to Y) is any rule which assigns to
each element x of X an element f(x) of Y. We may think of the
function f as an arrow f which starts at X and ends at Y. (An arrow
built up from all the individual arrows starting at an element x₁ of
X and ending at the element f(x₁) in Y.) If g is some other function
starting at Y and ending, say, at Z, there is an evident way of forming
a composite function c = g∘f:


It is the function c which assigns to each element x of X the element
g(f(x)) of Z; in other words, to apply c, first apply f to x, and then
g to the result. This c = g∘f is the usual composite function.

This defines the category of sets. The vertices X, Y, Z of the
category are sets, and the arrows f: X → Y of the category are the
functions with domain the set X and range Y, while composition of
functions is the operation just now defined. Moreover, each set
X has an “identity function” 1_X: X → X, which is just the function
which sends every element x of X to itself. With this description
one may see that the axioms (1) and (2) for a category both hold.
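For finite sets the statement can be checked directly. In the sketch below a function is represented, as an assumption of convenience, by a Python dictionary that sends each element of its domain to its value; the sets X, Y, Z, W and the functions f, g, h are arbitrary small examples.

```python
def compose(g, f):
    """Return the composite g∘f: first apply f, then apply g to the result."""
    return {x: g[f[x]] for x in f}


def identity(s):
    """Identity function on the set s, sending every element to itself."""
    return {x: x for x in s}


X = {1, 2, 3}
Y = {"a", "b"}
Z = {"p", "q"}
W = {10, 20}

f = {1: "a", 2: "b", 3: "a"}   # f: X -> Y
g = {"a": "p", "b": "q"}       # g: Y -> Z
h = {"p": 10, "q": 20}         # h: Z -> W

# Axiom (1): the two ways of forming the triple composite agree.
associative = compose(h, compose(g, f)) == compose(compose(h, g), f)

# Axiom (2): composing with an identity on either side changes nothing.
unital = compose(identity(Y), f) == f == compose(f, identity(X))

print(associative, unital)
```

The check covers only these particular functions; the general proof is the one-line observation that both triple composites send x to h(g(f(x))).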

This  category  of  sets  may  be  taken  as  the  starting  point  for  the 
foundations  of  mathematics.  Recently,  the  introduction  of  the  “new 
mathematics”  in  the  schools  has  emphasized  set  theory  and  the 
possibility  of  building  up  all  mathematical  discourse  in  terms  of 
sets  (especially,  using  sets  to  describe  finite  and  infinite  cardinal  and 
ordinal  numbers).  Some  experts  now  hold  that  the  usual  description 
of  set  theory  via  sets  and  their  elements  is  too  limited,  and  might 
be  replaced  by  suitable  axioms  on  the  category  of  sets  (see  reference 
1  below).  This  categorical  approach  (“set  theory  is  obsolete”)  empha¬ 
sizes  the  importance  of  functions  and  uses  this  emphasis  to  bring 
out  more  clearly  the  structure  of  mathematics. 

Finite  categories  may  also  be  constructed.  The  simplest  one  is  the 
category  with  just  one  vertex  (say  a  vertex  P)  and  just  one  arrow,  the 
identity  of  this  vertex.  Another  one  is  the  category  with  two  vertices 
P  and  Q  and  three  arrows;  namely,  the  two  identity  arrows  plus 
one  arrow  a,  from  P  to  Q,  as  in  the  figure 

[Diagram: vertices P and Q, an arrow a: P → Q, and identity arrows 1_P and 1_Q.]

In this category the composites must be defined as a∘1_P = a, 1_Q∘a =
a, 1_P∘1_P = 1_P and 1_Q∘1_Q = 1_Q, all as required by axiom (2). Each
of  the  other  sample  arrow  diagrams  listed  at  the  start  of  this  article 
may  be  regarded  as  a  category  in  much  the  same  way  (just  add  an 
identity  arrow  at  each  vertex). 
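A finite category of this kind is small enough to write down as a table and test by brute force. The encoding below, with each arrow stored as a pair (domain, codomain) and the composites stored in a dictionary keyed by (g, f), is one possible representation chosen for illustration.

```python
# Arrows of the two-vertex category, each with its (domain, codomain).
arrows = {"1_P": ("P", "P"), "1_Q": ("Q", "Q"), "a": ("P", "Q")}
identity = {"P": "1_P", "Q": "1_Q"}

# compose[(g, f)] is the composite g∘f, defined only when f ends where g starts.
compose = {
    ("1_P", "1_P"): "1_P",
    ("1_Q", "1_Q"): "1_Q",
    ("a", "1_P"): "a",
    ("1_Q", "a"): "a",
}


def composable(g, f):
    """True when the arrow g starts at the vertex where the arrow f ends."""
    return arrows[f][1] == arrows[g][0]


def check_axioms():
    """Verify that the table defines a category; raises AssertionError if not."""
    names = list(arrows)
    # A composite exists exactly for composable pairs and has the right ends.
    for g in names:
        for f in names:
            if composable(g, f):
                c = compose[(g, f)]
                assert arrows[c] == (arrows[f][0], arrows[g][1])
            else:
                assert (g, f) not in compose
    # Axiom (1): associativity of every composable triple.
    for h in names:
        for g in names:
            for f in names:
                if composable(g, f) and composable(h, g):
                    assert compose[(h, compose[(g, f)])] == compose[(compose[(h, g)], f)]
    # Axiom (2): identities act trivially on both sides.
    for f, (dom, cod) in arrows.items():
        assert compose[(identity[cod], f)] == f == compose[(f, identity[dom])]
    return True


print(check_axioms())
```

The same checker would work for any of the other finite arrow diagrams once their arrows and composition table are entered.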

Categories  also  arise  in  geometry.  For  example,  the  usual  rigid 
motions  can  be  regarded  as  the  arrows  of  a  suitable  category.  The 
vertices of this category will be the metric spaces; i.e., the geometrical
spaces where one knows the distance between any two points. In
more  detail,  a  metric  space  M  is  a  set  of  points  plus  a  rule  which 
assigns  to  each  pair  of  points  x,  y  of  M  a  non-negative  real  number 
d(x,y)  called  the  distance  from  x  to  y.  This  rule  must  satisfy  two 
axioms,  which  require  for  all  points  x,  y,  z  that 

d(x,y) = d(y,x),   d(x,y) + d(y,z) ≥ d(x,z).

The  first  states  that  distance  is  symmetric,  while  the  second  is  the 
triangle  axiom: 




(The  “straight”  distance  d(x,z)  from  x  to  z  is  never  longer  than  that 
via  y).  The  line,  the  plane,  and  3-space,  each  with  the  usual  formulas 
for  distances  (i.e.,  for  lengths  of  straight  line  segments)  are  metric 
spaces,  as  are  the  Riemannian  manifolds  of  differential  geometry, 
with distance determined from the usual tensor g_ij.

If M and N are metric spaces, a rigid motion f from M to N is
by definition a function f: M → N which preserves distance, in the
sense that

d(f(x), f(y)) = d(x,y)

for all x and y. This equation states that the distance from x to y
is always the same as the distance between the image points f(x) and
f(y). (Think of M = N as the plane; then f is one of the usual
rigid motions of the plane to itself, say, a translation followed by
a rotation or by a reflection.) In any event, the identity function
1_M: M → M on any metric space M to itself is always a rigid motion
in this sense. If two functions f: M → N and g: N → L between
metric spaces are both rigid motions, then the function g∘f: M →
L which is their composite is clearly also a rigid motion. Since the
axioms (1) and (2) hold for the composite so defined, we have
described a category, the category of metric spaces, with vertices
the metric spaces and arrows the rigid motions. Many of the basic
properties of metric spaces can be effectively described in terms of
this category; for example, the “completion” of a given metric space
is a special arrow from the given space to a complete metric space.
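The closure of rigid motions under composition can be illustrated in the plane. The sketch below takes M = N = L to be the plane with the usual distance and tests distance preservation on a handful of sample points only, so it illustrates the definition rather than proving anything. The helper names and the sample points are assumptions made for the example.

```python
import math


def dist(p, q):
    """Usual distance between two points of the plane."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def translation(dx, dy):
    return lambda p: (p[0] + dx, p[1] + dy)


def rotation(theta):
    c, s = math.cos(theta), math.sin(theta)
    return lambda p: (c * p[0] - s * p[1], s * p[0] + c * p[1])


def compose(g, f):
    """Composite g∘f: apply f first, then g."""
    return lambda p: g(f(p))


def preserves_distance(f, points, tol=1e-9):
    """Check d(f(x), f(y)) = d(x, y) for all pairs of the sample points."""
    return all(abs(dist(f(p), f(q)) - dist(p, q)) <= tol
               for p in points for q in points)


points = [(0.0, 0.0), (1.0, 0.0), (0.5, 2.0), (-3.0, 1.5)]
f = translation(2.0, -1.0)
g = rotation(0.7)
gf = compose(g, f)                          # a translation followed by a rotation
stretch = lambda p: (2.0 * p[0], 2.0 * p[1])  # not a rigid motion

print(preserves_distance(f, points),
      preserves_distance(g, points),
      preserves_distance(gf, points),
      preserves_distance(stretch, points))
```

The stretch is included as a contrast: it is a perfectly good function from the plane to itself, but it is not an arrow of this category.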

This  example  is  typical  of  the  construction  of  many  categories  in 
geometry.  Suppose  we  are  studying  geometrical  objects  G,  G',  G" 
of  some  type,  and  that  an  object  G  of  this  type  is  described  as  a  set 
of  points  with  certain  properties  and  relations.  The  corresponding 
category, then, has these objects as its vertices, while an arrow G →
G' is just a function from the set G to the set G' which preserves
all the listed properties and relations of points of G. Any composite
G → G' → G" of two such functions is another such, as is the
identity function 1_G: G → G for any geometrical object of the type
considered.  Since  axioms  (1)  and  (2)  hold,  we  have  constructed  a 
category;  namely,  the  category  of  all  geometric  objects  of  the  given 
type.  It  turns  out  that  many  basic  geometric  constructions  may  be 
formulated  in  perspicuous  ways  in  these  categories. 

Much  the  same  applies  to  algebraic  objects:  one  may  form  appro¬ 
priate  categories  in  which  the  vertices  are  algebraic  objects  (sets 
equipped  with  certain  operations)  and  the  arrows  are  the  functions 
which  preserve  these  operations.  As  a  sample,  we  consider  “monoids” 
(which  are  often  called  “semigroups  with  identity  element”).  We 
define:  a  monoid  M  is  a  category  with  just  one  vertex  P.  This  means 
that  all  the  arrows  m,  n  of  this  category  M  must  start  and  end  at  P. 
Hence  any  two  arrows  of  M  have  a  composite  m°  n  which  is  another 
arrow.  If  we  write  the  composite  m°  n  just  as  a  product  mn,  then 
the  axioms  (1)  and  (2)  for  a  category  simply  state  that  the  product 
is  associative  and  that  there  is  a  special  element  (the  arrow  1) 
which  is  an  identity  for  this  product.  In  other  words,  we  could  have 
defined  a  monoid  M  to  be  a  set  (of  arrows  m,  n)  with  a  product  mn 
which  is  associative  and  which  has  an  identity  element  1.  It  will  be 
easier  to  call  the  m's  in  M  not  arrows  but  elements  of  the  monoid 
M. 

The definition of a monoid is much like the definition of a group;
a group is a monoid in which every equation mx = 1 has a solution
x (the solution x = m⁻¹ is the “inverse” of the element m).

Now we define the arrows between two monoids M and K. An
arrow f: M -> K will be a function f on the set M to the set K which
satisfies

(3) f(mn) = (f(m)) (f(n))

for all elements m, n of M. This property (3) states that f takes
products to products (the product mn to the product (fm)(fn)). One
often says instead that f “preserves products” or that f is a
“homomorphism” of multiplication. If two functions f: M -> K and g: K
-> L between monoids both preserve products, so does their composite
g ° f. For any monoid M, the identity function 1_M: M -> M evidently
preserves products. Hence we have constructed a category, the
category of monoids, which has all monoids as vertices and all
product-preserving functions as arrows. There is strong indication that
this category of monoids can be used in the theory of automata,
since the semigroups have already been used there.
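
Property (3) is easy to test on examples. The sketch below uses two monoids chosen for illustration (they are not taken from the text): strings under concatenation, and integers under addition. It checks on a few samples that the length function preserves products, and that counting distinct letters does not. All names are hypothetical, and a finite check is of course not a proof.

```python
from itertools import product

def concat(m, n):
    """Product in the monoid of strings; the identity element is ''."""
    return m + n

def add(a, b):
    """Product in the monoid of integers under addition; the identity is 0."""
    return a + b

def length(word):
    """Candidate arrow f from strings to integers."""
    return len(word)

def distinct_letters(word):
    """A function which is not an arrow: it fails property (3)."""
    return len(set(word))

def preserves_products(f, prod_m, prod_k, samples):
    """Check f(mn) = f(m) f(n) on every ordered pair of samples."""
    return all(f(prod_m(m, n)) == prod_k(f(m), f(n))
               for m, n in product(samples, repeat=2))

samples = ["", "a", "ab", "cat", "monoid"]
length_is_arrow = preserves_products(length, concat, add, samples)
distinct_is_arrow = preserves_products(distinct_letters, concat, add, samples)
```

The second check fails already on the pair ("a", "a"): the product "aa" has one distinct letter, while the sum of the two values is 2.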

The category of groups is similarly described; the vertices are
groups and the arrows f: G -> H are the homomorphisms of groups
(that is, the product-preserving functions on one group G to another
group H). Similar constructions apply to other algebraic systems,
taking appropriate account of the operations used to define these
systems. Thus for systems described by two operations of addition
and multiplication, the arrows (the homomorphisms) are the functions
preserving both sums and products. In this fashion one obtains
both the category of all rings and that of all fields. Another
important category is that of vector spaces; the vertices of the
category are the vector spaces V (with real scalars); the arrows f: V ->
W are the functions which preserve all sums and preserve
multiplication by any real scalar (such a function is just a linear
transformation; that is, a function given in terms of bases by a matrix
of real scalars).

This is closely related to the category of matrices. One can form
the product BA of two rectangular matrices of real numbers only
when the matrices B and A fit: say B is an n × m and A is an m × k
matrix, with the same m. This fact means that the matrices can be
viewed as the arrows of a category. The vertices of this category are
the integers k = 1, 2, 3, ...; an m × k matrix A is an arrow from
k to m. The composite of two arrows B: m -> n and A: k -> m can
be formed exactly when there is a matrix product BA, and the
composite is defined to be exactly this matrix product. Then axiom
(1) for a category is the usual associative law for matrix
multiplication. Moreover, there are many identity matrices (one n × n
identity for each size n), and these identities satisfy axiom (2).
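
The category of matrices can be tried out directly. The following sketch stores a matrix as a list of rows and treats an m × k matrix as an arrow from k to m; the composite is refused when the sizes do not fit. The helper names and the three sample matrices are assumptions made for this illustration.

```python
def shape(a):
    """Return (rows, columns) of a matrix stored as a list of rows."""
    return len(a), len(a[0])

def compose(b, a):
    """Composite of arrows B: m -> n and A: k -> m, i.e. the product BA.

    The composite exists only when the number of columns of B equals
    the number of rows of A.
    """
    n, m = shape(b)
    m2, k = shape(a)
    if m != m2:
        raise ValueError("arrows do not fit: composite undefined")
    return [[sum(b[i][j] * a[j][l] for j in range(m)) for l in range(k)]
            for i in range(n)]

def identity(n):
    """The n x n identity matrix, the identity arrow at the vertex n."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

A = [[1, 2], [3, 4], [5, 6]]   # 3 x 2: an arrow 2 -> 3
B = [[1, 0, 2], [0, 1, 1]]     # 2 x 3: an arrow 3 -> 2
C = [[2, 1], [1, 1]]           # 2 x 2: an arrow 2 -> 2

# Axiom (1): associativity of the composite.
assoc_holds = compose(C, compose(B, A)) == compose(compose(C, B), A)
# Axiom (2): identities on either side leave an arrow unchanged.
identity_holds = (compose(identity(3), A) == A
                  and compose(A, identity(2)) == A)
```

Calling `compose(A, A)` raises an error, since an arrow 2 -> 3 cannot be followed by another arrow 2 -> 3.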

Thus  in  the  category  of  monoids  (or  of  groups,  or  of  rings)  the 
arrows are those functions, called “homomorphisms”, which preserve
all the algebraic operations involved. Often the arrows of any
category are called its “morphisms”. Thus for mathematical systems of
any  sort,  the  categorical  approach  always  asks:  “For  this  type  of 
system,  what  are  the  morphisms?”.  This  has  proved  especially  useful 
for  complex  types  of  systems  (e.g.,  for  manifolds  with  additional 
auxiliary  structures). 

A category C itself is a type of mathematical system. Hence we
ask, what is a morphism F: C -> D of categories? It ought to be a
function from the “elements” of C to those of D, preserving the
algebraic operations involved. Now the elements are vertices and
arrows, and the operations are composition and the construction of
identity arrows. Hence the morphism F must assign to each vertex
P of C a vertex F(P) of D and to each arrow f: P -> Q of C an arrow
F(f): F(P) -> F(Q) of D. These assignments must preserve composition
and identities; that is, they must satisfy

(4) F(g ° f) = (Fg) ° (Ff), F(1_P) = 1_F(P)

whenever the composite g ° f of the arrows g and f is defined. Such a
morphism F of categories is also called a functor on C to D. The
defining conditions (4) for a functor may be phrased: a functor takes
all diagrams in the category C into diagrams of the same shape in D.
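
Conditions (4) can be checked on a simple functor from sets to sets. The example is ours, not one named in the text: it sends each set to the set of finite tuples of its elements, and each function f to the function that applies f to every entry. The sketch tests both conditions on a few sample tuples; the names `F`, `compose`, and `identity` are hypothetical.

```python
def compose(g, f):
    """The composite g ° f: first f, then g."""
    return lambda x: g(f(x))

def identity(x):
    """The identity arrow."""
    return x

def F(f):
    """Arrow part of the functor: apply f to each entry of a tuple."""
    return lambda xs: tuple(f(x) for x in xs)

def f(n):
    return n + 1

def g(n):
    return 2 * n

samples = [(), (0,), (1, 2, 3), (5, 5, -1, 0)]

# First condition in (4): F(g ° f) = (Fg) ° (Ff).
preserves_composition = all(
    F(compose(g, f))(xs) == compose(F(g), F(f))(xs) for xs in samples)
# Second condition in (4): F sends an identity to an identity.
preserves_identity = all(F(identity)(xs) == xs for xs in samples)
```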

A  functor  is  thus  a  structure-preserving  map  from  one  category 
to  another  one.  The  two  categories  involved  may  be  of  very  different 
types;  in  particular,  there  is  often  occasion  to  consider  functors  from 
a  category  of  geometric  objects  (say,  metric  spaces,  topological  spaces, 
or  Riemannian  manifolds)  to  a  category  of  algebraic  objects  (say 
vector  spaces  or  groups).  Much  of  the  recent  development  of 
algebraic  topology  has  been  concerned  with  such  functors;  for  a 
given  space,  the  fundamental  group,  the  other  homotopy  groups, 
the  homology  groups,  and  the  cohomology  groups  all  provide  such 
functors.  Ten  years  ago  this  list  contained  essentially  all  the  known 
such  functors  passing  from  geometry  to  algebra.  In  the  last  ten  years 
the  presence  of  the  functorial  approach  has  been  a  great  stimulus 
to the study of new functors of this sort. Many new ones have
turned up: the so-called generalized homology theories, such as
K-theory and “cobordism”.

The functorial approach allows a precise description of the way
in which a geometric problem can be first formulated and then
solved by an algebraic translation of the problem. Steenrod² has
clearly explained how this occurs with the use of cohomology
operations to solve homotopy classification problems. A typical such
problem is this: given two (metric or topological) spaces X and Y and a
continuous function f: X -> Y, when is this function “trivial”, in
the sense that it can be continuously deformed to another function
mapping all of X into a single point in Y? The mechanism of
algebraic translation is this: take a functor F on the category of
topological spaces to (say) the category of groups. Then F assigns to
the given arrow f: X -> Y between spaces an image arrow F(f): F(X) ->
F(Y) between groups. It usually follows easily that if f is trivial,
F(f) must also be zero (but not vice versa). Hence if we find a functor
F with F(f) ≠ 0, we know that the given arrow f is not trivial. If
we find only F(f) = 0, we do not know whether or not the original
f was trivial: we must search for new functors F with greater powers
of discrimination. The development of algebraic topology has indeed
led to the successive discovery of many such functors, often found by
taking them with values in categories of algebraic objects with
successively more elaborate structure.

These examples suggest that the notion of a functor is useful in
describing other cases of the translation of problems and their
solutions from one context to another. There are, in fact, many such
examples. An important one in differential geometry is the functor
mapping each “Lie Group” to its “Lie Algebra”.

The notion of a “universal mapping” is another general idea
closely related to that of a functor. Suppose we are working within
some category, and that some of the vertices in this category have
an especially desirable property: let us call these special
vertices the “good” gadgets or just the “gadgets” G. Given any
vertex P, we want an arrow from P to a gadget. Such an arrow u:
P -> G is called a universal mapping (for the given class of “good”
gadgets) when the following holds: If w: P -> G' is any other arrow
starting at P and ending at a gadget G', then there is exactly one
arrow f: G -> G' with f ° u = w. The diagram is

[Diagram: arrows u: P -> G and w: P -> G', with a dotted arrow
f: G -> G' such that f ° u = w.]

u is universal if to each w there exists (dotted arrow) a unique such
f. Put differently: every w factors uniquely, as w = f ° u, through the
universal arrow u.

There are many examples of such universal arrows. Some occur
in the construction of our number systems. Let us consider a number
system to be a set (of numbers) with the two usual algebraic
operations of addition and multiplication. An arrow between two
such systems will accordingly be a function which preserves both
sum and product (i.e., a morphism both of addition and of
multiplication). Here are some familiar such systems:


N   all natural numbers     0, 1, 2, 3, ...
Z   all integers            0, ±1, ±2, ±3, ...
Q   all rational numbers    m/n for integers m and n ≠ 0
R   all real numbers
C   all complex numbers     x + iy, for x and y real.


In this list each system is included in the next, and in each case the
inclusion N -> Z, Z -> Q, etc. can be regarded as an arrow; more
explicitly, an arrow in our category of systems with addition and
multiplication.

Now in N, the system of natural numbers, subtraction is not
always possible. (For example, the equation 2 + x = 1 has no
solution x.) So let “gadget” mean a system in which subtraction is
always possible. Since subtraction is possible in the system Z of
integers, the inclusion u: N -> Z is a mapping of N to a gadget.
One can readily show that it is the universal such mapping; any
other way of embedding the integers in a gadget (i.e., in a system
with subtraction) can be had by going through Z. This provides
an exact conceptual description of what Z is and reformulates the
familiar observation that it is necessary to include all of the integers
in any system in which we are to have subtraction (i.e., every
negative integer can be described as the solution of some subtraction
problem).
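
The factorization through Z can be made concrete. In the sketch below the target gadget is taken, purely as an assumed example, to be the integers modulo 5, and w is reduction modulo 5 on the natural numbers. The extension f to all of Z is forced: a negative integer -n must go to the negative of w(n). The code checks on a finite range that f ° u = w and that f preserves sums and products.

```python
MOD = 5  # the gadget here is the integers modulo 5 (an assumed example)

def add5(a, b):
    return (a + b) % MOD

def mul5(a, b):
    return (a * b) % MOD

def neg5(a):
    """Additive inverse in the integers modulo 5."""
    return (-a) % MOD

def u(n):
    """The inclusion u: N -> Z."""
    return n

def w(n):
    """An arrow w from N to the gadget, preserving sums and products."""
    if n < 0:
        raise ValueError("w is defined on natural numbers only")
    return n % MOD

def f(z):
    """The extension of w to Z: a negative integer -n goes to -w(n)."""
    return w(z) if z >= 0 else neg5(w(-z))

naturals = range(0, 12)
integers = range(-12, 13)

factors_through_u = all(f(u(n)) == w(n) for n in naturals)
f_preserves_sums = all(f(a + b) == add5(f(a), f(b))
                       for a in integers for b in integers)
f_preserves_products = all(f(a * b) == mul5(f(a), f(b))
                           for a in integers for b in integers)
```

Uniqueness is visible in the definition of `f`: once f agrees with w on the natural numbers and preserves sums, the value at -n has no freedom left.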

In the integers Z subtraction is possible, but division by non-zero
integers is not always possible: for example, 2x = 1 does not have
an integral solution x. Clearly the rational numbers m/n are
introduced exactly in order to have solutions to such equations nx = m
in integers m and n ≠ 0. This can be formulated in the “universal”
language as follows: let a “gadget” now be a system in which both
subtraction and division (by non-zero numbers) are possible. Then
the system Q of all rational numbers is a gadget, and moreover, the
inclusion u: Z -> Q of the integers in Q is a universal mapping.
One may prove that any other morphism from Z to a gadget must
factor uniquely through Q.

In the same way one can show that the inclusion mapping Q ->
R into the real numbers R is universal for mappings from Q into
systems in which every convergent sequence has a limit. Thus R
has this property but in R the equation x² = -1 has no root.
This equation does have a root in the system C of complex numbers
(in fact, it has two roots x = +i and x = -i there). This accounts
exactly for the complex numbers. In other words, the inclusion
R -> C is universal among mappings from the reals to a system
where x² = -1 has a solution.

Usually the complex numbers are defined as numbers x + iy,
for i = √-1 and x and y real numbers, with certain formulas
given to define addition and multiplication of these numbers.
Alternatively, the complex number x + iy may be described simply as
an ordered pair (x,y) composed of two real numbers, again with
suitable formulas for sums and products of pairs. This description
may be made more geometric by saying that the complex number
x + iy is the point with coordinates (x,y) in the plane. As is well
known, the operations of addition and multiplication can then be
given geometrically. However, all of these descriptions really amount
to the same thing, and in every case the rules for addition and
multiplication can be deduced from the one fact that the arrow
R -> C provides a universal solution for x² = -1. Now, it can
be readily seen that an arrow solving such a universal problem is
essentially uniquely determined. Hence the universality of R -> C
is a complete description of the complex numbers C in terms of the
real numbers.
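
The remark that the rules for pairs follow from x² = -1 can be tried out directly. Expanding (a + ib)(c + id) with the usual laws of algebra and replacing i² by -1 gives the pair (ac - bd, ad + bc). The sketch below implements that rule on ordered pairs and compares it with Python's built-in complex numbers on integer examples; the function names are ours.

```python
def add(p, q):
    """Sum of two pairs, taken coordinate by coordinate."""
    return (p[0] + q[0], p[1] + q[1])

def mul(p, q):
    """Product of pairs, obtained by expanding (a + ib)(c + id) with i*i = -1."""
    (a, b), (c, d) = p, q
    return (a * c - b * d, a * d + b * c)

def include(x):
    """The arrow R -> C, sending a real number x to the pair (x, 0)."""
    return (x, 0)

I = (0, 1)             # the pair playing the role of i
i_squared = mul(I, I)  # should be the image of -1

def as_builtin(p):
    """Convert a pair to Python's own complex type, for comparison only."""
    return complex(p[0], p[1])

pairs = [(1, 2), (3, 4), (0, -5), (-2, 7)]
agrees_with_builtin = all(
    as_builtin(mul(p, q)) == as_builtin(p) * as_builtin(q)
    and as_builtin(add(p, q)) == as_builtin(p) + as_builtin(q)
    for p in pairs for q in pairs)
```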

Stated differently: all the properties of the complex numbers are
deduced in this way from the real numbers and the one equation
x² = -1. A corresponding process can be used to construct other
number systems. Thus if we start from the system Q of rational
numbers and any polynomial equation which cannot be factored
there (which is “irreducible” over Q), we obtain a universal
construction which gives one of the so-called algebraic number fields.
The study of these fields is a flourishing branch of algebra. Again,
if we start with any system R with appropriate operations of addition
and multiplication (technically, one of the systems called a ring)
and with any suitable collection I of elements of R, we obtain a
universal mapping R -> R/I which ends at a new ring R/I, called
a quotient ring of R. This mapping is a homomorphism universal
among homomorphisms starting from R and trivial on I. (Technically,
with kernel containing the “ideal” I.)

Universal constructions of quotient systems like this apply to
other parts of algebra. In group theory, one uses extensively the
quotient group G/N of a group G by a normal subgroup N of G.
This quotient group can be completely described by a universal
property of the projection G -> G/N (the projection is that arrow
which sends each element of G to its “coset” modulo N). The
formula (G/N)/(M/N) ≅ G/M and other “homomorphism” theorems
involving these quotients can be proved directly from the
universality. A similar description applies to the equivalence classes of
a set modulo a reflexive, symmetric, and transitive relation, or to the
quotient of a space obtained by “collapsing” a subspace to a point.

The familiar process of substituting constants for variables in a
polynomial is also an example of universality. To be explicit, take
polynomials in a “variable” or “letter” x with integral coefficients,
such as the polynomial

p(x) = 1 + 7x - 6x² + 3x⁴.

One may substitute x = 2 to get p(2) = 1 + 7·2 - 6·2² + 3·2⁴,
or x = m for any integer m to get

functions include all the familiar ones like sin x, tan x, cos 2x, ...
A  periodic  function  may  (like  these)  take  real  numbers  as  values,  or 
may  have  values  in  any  other  set. 

Consider in particular the function p(x) which sends each real
number x to the angle θ with measure x radians, where we exhibit
the angles by the corresponding points on a circle of radius 1.

[Figure: the real line, with points O, P, Q, and the unit circle,
with points O', P', Q'.]

(In the figure, O, P, Q on the real line go to O', P', Q' on the
circle.) Thus p(x) is the function which “wraps” the real line R
uniformly around the circle S¹; it is an arrow p: R -> S¹ and is also
clearly periodic: 0, 2π, 4π, ... are all wrapped by p to the same
point O' = (1,0) on the circle. Now, it is a familiar fact that any
periodic function g(x) can be considered as a function f(θ) of
angles, as in the diagram

[Diagram: g factors as g = f ° p, with p: R -> S¹.]

This  states  that  any  periodic  g  can  be  factored  as  shown  through 
the universal periodic function p. In this way familiar facts about
angles are examples of universality. Indeed, this universality really
can be viewed as a construction of the circle S¹ from the line by the
congruence relation “x and x' differ by an integral multiple of 2π.”
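
The factorization of a periodic function through p can be checked numerically. In the sketch below, p sends x to the point (cos x, sin x) of the unit circle, and for a given function g of period 2π the function f on the circle is obtained by choosing, for each point, one real number that p wraps onto it. The particular g, the helper names, and the use of the standard library function `math.atan2` to pick that number are choices made for this illustration; equality holds only up to floating-point rounding.

```python
import math

def p(x):
    """Wrap the real line around the unit circle."""
    return (math.cos(x), math.sin(x))

def factor_through_circle(g):
    """Given g of period 2*pi, return f on circle points with g = f ° p."""
    def f(point):
        cx, cy = point
        theta = math.atan2(cy, cx)  # one real number which p sends to this point
        return g(theta)
    return f

def g(x):
    """A sample function of period 2*pi."""
    return math.sin(x) + math.cos(2 * x)

f = factor_through_circle(g)
xs = [0.0, 1.0, 2.5, -4.0, 7.0, 2 * math.pi + 0.3]
max_error = max(abs(g(x) - f(p(x))) for x in xs)
```

The value `max_error` stays at the level of rounding error, which is what g = f ° p predicts.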

The basic notion of a subset S of a set X (in symbols S ⊂ X) can
be formulated in terms of universality. Take the particular set {0, 1}
consisting of just two elements, 0 and 1. Take the subset {0} ⊂
{0, 1} consisting of 0 alone. This set {0} is a “universal” subset in
the following sense. For any subset S of any X we can describe a
corresponding “characteristic function” C_S: X -> {0, 1} with values
in the special set {0, 1} as the function with

C_S(x) = 0 when x is in S
       = 1 when x is not in S.

Now for any function f: X -> {0, 1} one can describe the “inverse
image” of the subset {0} to be the collection of exactly those
elements s of X for which f(s) = 0 (those elements which map to zero
under f). For the characteristic function C_S, the inverse image of
{0} is exactly the given subset S, and it is the only function f: X
-> {0, 1} with this inverse image for {0}. This can be exhibited in the
figure


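[Figure: the subset S ⊂ X, the characteristic function
C_S: X -> {0, 1}, and the universal subset {0} ⊂ {0, 1}.]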
which states (using the “inverse” effect on subsets) that {0} is a
universal  subset.  (The  fact  that  an  arrow  goes  backwards  here  is 
typical  of  “contravariant”  effects  in  many  such  cases.) 
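
For a small finite set the two claims about the characteristic function can be verified by brute force. The sketch below, with a four-element X and a two-element S chosen for illustration, recovers S as the inverse image of {0} and then enumerates all sixteen functions X -> {0, 1} to confirm that only one of them has this inverse image. The names are hypothetical.

```python
from itertools import product

def characteristic(S):
    """Return C_S, with C_S(x) = 0 when x is in S and 1 otherwise."""
    return lambda x: 0 if x in S else 1

def inverse_image_of_zero(f, X):
    """Collect exactly those elements of X which f sends to 0."""
    return {x for x in X if f(x) == 0}

X = ["a", "b", "c", "d"]
S = {"b", "d"}

C_S = characteristic(S)
recovered = inverse_image_of_zero(C_S, X)

# Enumerate all 16 functions X -> {0, 1} as dictionaries and keep those
# whose inverse image of {0} is exactly S.
all_functions = [dict(zip(X, values))
                 for values in product([0, 1], repeat=len(X))]
matching = [h for h in all_functions
            if inverse_image_of_zero(h.get, X) == S]
```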

The basic axioms for the arithmetic of natural numbers can also
be formulated in terms of a universal mapping. For this purpose, we
let a “gadget” consist of a set X together with a function t: X -> X
mapping X into itself. For example, X might be the set {0,1,2} with
just three elements 0, 1, 2 while t is the function sending each element
to the next (and the last back to 0). Thus the effect of t can be
written out as a long sequence

0 -> 1 -> 2 -> 0 -> 1 -> 2 -> 0 ... .

Again, X might be the same set {0,1,2} and t the function sending 0
to 1, 1 to 2, and 2 back to 1. The effect of t is then indicated by the
following sequence

0 -> 1 -> 2 -> 1 -> 2 -> 1 -> ... .

This suggests that the “typical” gadget with a chosen starting point
is represented by a sequence (which may or may not repeat itself).
The universal such sequence ought to be the sequence

N: 0 -> 1 -> 2 -> 3 -> 4 -> ... .


This is just the gadget consisting of the set N of all natural numbers
together with the function s: N -> N which sends each natural number
n to its “successor” n + 1. Now this inclusion of 0 in N is an
arrow u: {0} -> (N, s) from {0} to this gadget N. We claim that it
is universal: Given any arrow w: {0} -> (X, t) from the set {0} to a
gadget in our present sense, there is a unique function f: N -> X
which is a morphism of these gadgets and which has f(0) = w(0).

Indeed, to say that f: N -> X is a map of gadgets means exactly
that f ° s = t ° f, so this map f is given by

f(0) = w(0), f(1) = t(w(0)), f(2) = t(t(w(0))), ...

and in general f(m) is the result of applying t m times to w(0).
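
The formula for f is a definition by recursion, and it runs as written. The sketch below builds f from a gadget (X, t) and a starting value w(0), then applies it to the two three-element gadgets described earlier; the helper name `universal_map` and the dictionary encoding of t are assumptions of this illustration.

```python
def universal_map(t, start):
    """Return f: N -> X with f(0) = start and f(n + 1) = t(f(n))."""
    def f(m):
        value = start
        for _ in range(m):
            value = t(value)
        return value
    return f

def s(n):
    """The successor function on the natural numbers."""
    return n + 1

def t_cycle(x):
    """Send each element of {0, 1, 2} to the next, and 2 back to 0."""
    return {0: 1, 1: 2, 2: 0}[x]

def t_tail(x):
    """Send 0 to 1, 1 to 2, and 2 back to 1."""
    return {0: 1, 1: 2, 2: 1}[x]

f_cycle = universal_map(t_cycle, 0)
f_tail = universal_map(t_tail, 0)

cycle_sequence = [f_cycle(m) for m in range(7)]
tail_sequence = [f_tail(m) for m in range(6)]

# f is a map of gadgets: f ° s = t ° f, checked on an initial segment of N.
cycle_is_morphism = all(f_cycle(s(m)) == t_cycle(f_cycle(m)) for m in range(20))
tail_is_morphism = all(f_tail(s(m)) == t_tail(f_tail(m)) for m in range(20))
```

The two lists reproduce the sequences 0, 1, 2, 0, 1, 2, 0 and 0, 1, 2, 1, 2, 1 written out above.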

This one property of N and the successor function s: N -> N can
be used as a complete description of the natural numbers. (It is
equivalent to the usual Peano postulates for N.) In particular, this
property of N can be used to construct the operations of addition
and subtraction. When combined with other properties, it gives a
new system¹ of axioms for set theory.

We  have  thus  given  a  number  of  examples  of  universal  mappings: 
the  basic  property  just  given  for  the  system  of  natural  numbers;  the 
arrows  including  this  system  in  successively  larger  systems  of  integers, 
rational  numbers,  real  numbers,  and  complex  numbers;  the  arrows 
mapping  a  system  on  a  quotient  system,  or  constructing  polynomials, 
or describing subsets by characteristic functions, or mapping numbers
upon angles so as to get a universal periodic function. There
are many other examples of universal arrows in other parts of
mathematics, as well as examples of a related notion called an
“adjoint” pair of functors, and there is already a considerable roster
of  theorems  about  such  functors. 

Indeed, functors occur naturally in connection with universal
mappings u: P -> G. The given class of good gadgets G forms a
category C with vertices the gadgets G and arrows the arrows f:
G -> G'. There is a functor F on this category to the category of
sets, described in terms of the given vertex P: This functor assigns
to each gadget G the set of all arrows v: P -> G from the fixed
P to this G, and to each arrow f: G -> G' the function F(f): F(G) ->
F(G') which sends each arrow v to f ° v: P -> G'. The universal
arrow u: P -> G may be described as a universal element of this
functor; this description is effectively used in proving theorems
about universal arrows and other adjoint functors. For example, the
reader might try to prove that, given P, the universal mapping u:
P -> G is essentially unique. Here “essentially” means the following:
If u*: P -> G* is another universal mapping to some other good
gadget G*, then G* is isomorphic to G in the exact sense that there
are arrows f: G -> G* and f*: G* -> G with f ° u = u* and f* ° u*
= u as well as

f ° f* = 1: G* -> G*, f* ° f = 1: G -> G.

These last equations state that f* is the (two-sided) inverse of f.
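
One way to carry out the suggested proof is sketched below. It uses only the existence and uniqueness clauses in the definition of a universal mapping; the arrangement of the steps is ours.

```latex
\begin{align*}
&\text{Since } u \text{ is universal, there is a unique } f\colon G \to G^{*}
  \text{ with } f \circ u = u^{*}.\\
&\text{Since } u^{*} \text{ is universal, there is a unique } f^{*}\colon G^{*} \to G
  \text{ with } f^{*} \circ u^{*} = u.\\
&\text{Then } (f^{*} \circ f) \circ u = f^{*} \circ (f \circ u)
  = f^{*} \circ u^{*} = u = 1_{G} \circ u.\\
&\text{Both } f^{*} \circ f \text{ and } 1_{G} \text{ are arrows } G \to G
  \text{ through which } u \text{ factors, so uniqueness gives }
  f^{*} \circ f = 1_{G}.\\
&\text{Symmetrically, } (f \circ f^{*}) \circ u^{*} = f \circ u = u^{*}
  = 1_{G^{*}} \circ u^{*}, \text{ so } f \circ f^{*} = 1_{G^{*}}.
\end{align*}
```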

With  this  we  end  our  introduction  to  mappings  and  categories. 
The subject of category theory and adjoint functors is growing so
rapidly that printed information is largely out-of-date, but there are
a number of available further sources. An elementary description of
the universal elements of functors and their manifold applications
to algebra may be found in a text by Mac Lane and Birkhoff.³ For
a more inclusive presentation of category theory one may consult
the sprightly monograph by Freyd,⁴ the systematic treatment by
Mitchell,⁵ or the expository article by Mac Lane.⁶ The state of
present-day research on the subject may be sampled in the report
on a recent AFOSR-sponsored conference.⁷ The applications of
category theory to be found there range from topology through algebra
to automata. Many more applications and developments are to
be expected.

At  the  coming  summer  meeting  of  the  American  Mathematical 
Society,  the  major  series  of  lectures  (the  Colloquium  lectures)  will 
be  given  by  Samuel  Eilenberg,  on  the  application  of  categories  and 
functors to automata. Current investigations of Lawvere (not yet
published) indicate a real possibility of a “categorical dynamics” in
which  many  of  the  standard  concepts  of  dynamics  can  be  formulated 
in  a  simpler  and  more  powerful  form. 

X. Photochemical Systems

George  S.  Hammond 

I  believe  that  photochemical  changes,  chemical  changes  caused 
by  absorption  of  light,  have  great  potential  for  use  as  working  parts 
of  complex  systems.  Chemical  reactions  in  general  have  great  prom¬ 
ise  for  use  as  gears  or  driving  parts  in  machines,  so  photochemistry 
is  only  a  special  case  with  especially  attractive  features  of  fast 
response  and  precision  control.  To  me  it  is  amazing  that  increasing 
understanding  of  biochemical  processes  has  not  inspired  more  seri¬ 
ous  thought  about  the  construction  of  man-made  chemical  ma¬ 
chines.  I  do  not  have  in  mind  just  mimicking  the  chemical  ma¬ 
chinery  of  living  systems  but  believe  that  there  is  real  merit  in 
using  the  example  to  illustrate  the  complexity  and  compactness 
available  in  chemical  systems.  We  already  have  outstanding  ex¬ 
amples  of  the  use  of  chemical  change  in  complex  systems;  the  lead 
storage  battery  is  a  familiar  case.  However,  the  general  concept  has 
seldom  been  spelled  out  and  considered  seriously. 

In  this  lecture  I  plan  to  do  two  things,  both  in  a  rather  superficial 
manner.  First,  I  will  describe  some  of  the  types  of  physical  and 
chemical  processes  that  seem  important  in  photochemical  systems.  In 
addition,  I  will  add  some  thoughts  concerning  conceivable  applica¬ 
tions  in  useful  systems.  I  should  note  that  a  principal  interest  in 
photochemistry  to  chemists  is  still  development  of  new  and  interest¬ 
ing  synthetic  methods.  Applications  to  synthesis  will  not  be  stressed 
in  this  presentation,  merely  because  the  inviting  prospects  of  syn¬ 
thetic  photochemistry  are  already  well-recognized. 

In  Table  I  we  have  listed  very  briefly  some  uses  and  significances 
of photochemistry, in addition to application in chemical synthesis.
Monitoring  of  light  is  an  obvious  use  which  can  be  put  to  good  service 
in  control  systems  and  can  also  be  used  to  gather  information  con¬ 
cerning  the  intensity  and  character  of  the  light  bath  in  places  that 
may  be  somewhat  inaccessible:  for  example,  in  outer  space. 

TABLE  I.  Significance  of  Photochemistry 

1.  Monitoring  of  light 

a.  Intensity 

b.  Wavelengths 

2.  Effects  on  materials 

3.  Utilization  of  solar  energy 

There  are  a  number  of  ways  in  which  monitoring  can  be  done. 
First  of  all,  measurement  of  total  chemical  response  can  provide 
either  an  integral  or  differential  measure  of  light  intensity.  Systems 
that  undergo  irreversible  photochemical  change  will  give  a  perma¬ 
nent  record  of  the  total  amount  of  activating  light  received  by  a 
sample.  On  the  other  hand,  systems  that  are  highly  reversible  reach 
stationary states in which a chemical reaction A -> B is forced by
absorption  of  light  and  the  reverse  process  B  ->  A  is  an  ordinary 
thermal  reaction  that  attempts  to  force  reversion  to  A.  The  ratio 
A  to  B  at  a  stationary  state  therefore  can  provide  a  measure  of  the 
instantaneous  intensity  of  illumination. 
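
The dependence of the stationary ratio on intensity can be made explicit under simple assumptions that are ours and are not stated in the lecture: suppose the light-driven step A -> B proceeds at a rate proportional to the light intensity I and to the concentration of A (with a constant of proportionality c), and that the thermal return B -> A is first order with rate constant k.

```latex
\frac{d[B]}{dt} = c\,I\,[A] - k\,[B] = 0
\quad\Longrightarrow\quad
\frac{[A]}{[B]} = \frac{k}{c\,I}
```

On this idealized model the ratio of A to B at the stationary state is inversely proportional to the instantaneous intensity, so a measurement of the ratio gives I once c and k have been calibrated.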

Both  kinds  of  monitoring  can  be  made  either  selective  or  non- 
selective  with  respect  to  the  wavelength  of  the  exciting  light.  This, 
of  course,  arises  from  the  fact  that  most  photochemically  active  com¬ 
pounds  have  distinctive  absorption  spectra  throughout  the  visible 
and  ultraviolet  spectral  regions.  Obviously,  if  one  wants  to  gain 
specific  information  concerning  a  particular  spectral  region  one  has 
only  to  choose  a  photochemical  monitor  that  absorbs  exclusively,  or 
nearly  exclusively,  in  that  region.  The  spectral  characteristics  of  the 
monitoring  system  can  be  supplemented  by  use  of  suitable  chemical 
or  optical  light  filters. 

The  second  category  listed  in  Table  I  is  enormously  broad,  as  are 
most subjects entitled “effects”. A good many of the well-known
effects  of  light  are  the  damaging  consequences  of  exposure  of  many 
kinds  of  materials  to  sunlight.  These,  of  course,  include  effects  on 
people; sunburns and skin cancers are two well-known examples of
damage  to  people  caused  by  photochemistry.  Effects  on  other  materials 
are  also  numerous.  Nearly  everyone,  and  certainly  most  people  living 
in  Southern  California,  are  acutely  aware  of  the  fact  that  bright  sun¬ 
light  is  responsible  for  triggering  the  oxidation  reactions  that  turn 
organic  vapors  into  the  noxious  mixture  known  as  smog.  In  a  less 
dramatic  way,  photochemical  oxidation  initiates  changes  in  most 
common  organic  materials  that  we  encounter  in  everyday  life;  for 
example,  the  working  lifetimes  of  many  synthetic  fibers,  plastic  films, 
paints,  and  varnishes  might  be  extended  almost  indefinitely  if  they 
were  always  kept  in  the  dark.  However,  I  would  not  want  to  leave  the 
impression that “effects” are all undesirable. Amazingly useful effects
can  be  caused  by  exposure  of  some  kinds  of  substances  to  light.  An 
intriguing  example  is  the  phenomenon  of  photoconductivity.  Some 
substances  which  are  normally  nonconducting  become  excellent 
electrical  conductors  when  exposed  to  light. 

Finally,  I  will  admit  to  having  some  fairly  grandiose  notions  con¬ 
cerning  the  possibility  of  using  man-contrived  photochemistry  to 
harvest  and  store  solar  energy.  I  do  not  have  in  mind  the  simple 
replication  of  the  photosynthetic  process  of  green  plants  but  think  in 
terms  of  development  of  new  photochemical  systems,  which  will  sup¬ 
plement  the  contribution  of  natural  photosynthesis  by  being  applic¬ 
able  under  conditions  where  the  green  plant  does  not  fare  well.  Good 
examples  can  be  found  on  much  of  the  surface  of  this  planet,  since 
the  total  dependence  of  green  plants  on  water  prohibits  our  getting 
much  useful  yield  from  solar  energy  that  falls  on  arid  portions  of  the 
earth’s  surface.  Another  even  more  glamorous  example  is  found  in 
the  conditions  existing  in  most  places  removed  from  the  surface  of 
the  earth.  For  example,  if  we  ever  do  colonize  the  moon  we  will  have 
to establish some sort of a self-contained energy economy there. As far
as I can see, there are only two likely candidates as a basis for that
economy. First, and perhaps most likely, is the tapping of nuclear
energy, but the only other prospect that I can take seriously is
man-made photochemical systems which are not dependent upon the
vulgar  use  of  water.  Incidentally,  such  systems  will  probably  be 
designed  to  exploit  the  high-energy  light  from  the  sun  that  does  not 
pass  through  the  earth’s  atmosphere.  Even  if  green  plants  could  be 
trained to survive without water, there is considerable doubt as to
whether or not they could be educated to take advantage of the light
of  wavelengths  that  bathe  the  surface  of  the  moon  generously  but 
never  reach  the  surface  of  the  earth. 

Photochemistry,  of  course,  begins  with  the  absorption  of  light.  In 
order  to  discuss  the  absorption  process  it  is  necessary  to  consider  in  a 
little  detail  what  happens  to  a  molecule  when  it  acquires  a  quantum 
of  light  energy.  Figure  1  shows  very  schematically  the  kind  of  think¬ 
ing  that  we  use  in  analyzing  the  absorption  process  and  its  relation¬ 
ship  to  ultimate  chemical  change.  First  of  all,  the  energy  gained  by 
an  absorbing  system  is  usually  highly  localized  in  individual  mole¬ 
cules.  So,  we  can  discuss  the  process  of  excitation  in  terms  of  the 
excited  states  of  molecules.  Most  of  the  excitation  energy  is  acquired 
by  electrons,  and  the  process  is  therefore  referred  to  as  electronic 
excitation.  This  is  true  of  visible  and  ultraviolet  light,  which  are  the 
only  sources  involved  in  any  known  photochemical  change.  Infrared 
radiation  is  absorbed  by  molecules  but  goes  into  exciting  vibrations 
and  rotations  and  is  very  rapidly  lost  into  the  thermal  pool  of  the 
system.  In  short,  infrared  radiation  is  really  only  useful  for  warming 
up  an  entire  system  and  does  not  selectively  promote  individual 
molecules to very high levels of excitation.
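
The contrast between ultraviolet, visible, and infrared quanta can be made concrete by converting wavelength to energy per mole of photons. The following is a minimal sketch; the three wavelengths are illustrative choices and are not taken from the text.

```python
# Exact SI defining constants.
PLANCK_J_S = 6.62607015e-34
LIGHT_SPEED_M_S = 2.99792458e8
AVOGADRO_PER_MOL = 6.02214076e23
JOULES_PER_KCAL = 4184.0


def kcal_per_mole(wavelength_nm):
    """Energy of one mole of photons of the given wavelength, in kcal/mole."""
    if wavelength_nm <= 0:
        raise ValueError("wavelength must be positive")
    energy_per_photon_j = PLANCK_J_S * LIGHT_SPEED_M_S / (wavelength_nm * 1e-9)
    return energy_per_photon_j * AVOGADRO_PER_MOL / JOULES_PER_KCAL


if __name__ == "__main__":
    # Illustrative wavelengths: near ultraviolet, visible, infrared.
    for label, nm in [("ultraviolet", 300.0), ("visible", 500.0), ("infrared", 3000.0)]:
        print(f"{label:12s} {nm:7.0f} nm  {kcal_per_mole(nm):6.1f} kcal/mole")
```

An ultraviolet quantum carries roughly ten times the energy of the infrared one chosen here, which is the sense in which only the former can promote an individual molecule to a high level of excitation.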

In Figure 1 the symbol S0 is used to indicate the ground state of a
molecule. The S stands for “singlet”. This simply means that most
common  molecules  contain  even  numbers  of  electrons  and  those 
electrons  have  their  spins  paired.  The  following  is  a  schematic  pre¬ 
sentation  of  the  usual  significance  of  the  terms  singlet  and  triplet, 
which  are  commonly  encountered  in  discussions  of  the  mechanisms 
of photochemical reactions.

Molecules are described by assigning electrons to orbitals in pairs.
The assignment is done so as to use the orbitals of the system having
the lowest energy. These orbitals may be of various kinds, such as
localized bonding orbitals in a molecule such as methane, or
nonbonding orbitals such as those occupied by nonbonding electrons
in a water molecule, or they may be delocalized molecular orbitals
covering large portions of a molecule as are found in highly
unsaturated compounds such as benzene. For our present purposes, it is not
very important to worry about the detailed description of the orbitals
involved in excitation. However, development of viable theories
concerning the relationship between structure and reactivity in
photochemistry will require intimate understanding of such characteristics.

Figure 1. Excitation Processes. (Legend: radiative transitions.)

As is shown above, absorption of light promotes an electron from an
occupied orbital to one that is unused in the ground state of the
molecule. The excited state is designated as S1 if it is the lowest lying
excited  singlet  state  of  the  molecule.  The  act  of  absorption  of  light 
by  a  singlet  molecule  ordinarily  leads  only  to  the  production  of 
excited  singlet  states  as  primary  products.  This  is  one  of  the  very 
good  selection  rules  that  has  been  known  for  a  very  long  time  by 
spectroscopists.  For  all  practical  purposes,  it  is  completely  valid  for 
molecules  that  contain  only  light  atoms.  This  includes  the  majority  of 
organic  compounds  which  are  currently  of  most  interest  to  photo¬ 
chemists.  As  I  have  shown  above,  an  excited  singlet  state  has  electrons 
which  are  orbitally  unpaired  but  still  have  paired  spins.  Since  the  two 
unpaired  electrons  occupy  different  orbitals,  they  become  free  to 
assume  parallel  as  well  as  antiparallel  spin  states.  If  one  of  the  elec¬ 
trons  undergoes  a  spin  flip  so  that  the  pair  spin  in  the  same  direction, 
the  molecule  as  a  whole  will  possess  an  electronic  angular  momentum. 
In  a  strong  magnetic  field,  the  magnetic  moment  of  such  a  molecule 
can  assume  one  of  three  distinctly  different  orientations  with  respect 
to  the  direction  of  the  applied  field.  There  are  really  three  states 
rather  than  one,  and  this  is  the  reason  that  the  configuration  that  I 
have labeled above as T1 is called a triplet.
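
The counting behind the names singlet and triplet can be written out explicitly. This is a small sketch using the standard rule that a total spin S has 2S + 1 orientations in a field; the function name is hypothetical.

```python
from fractions import Fraction


def spin_sublevels(total_spin):
    """List the allowed spin projections for a given total spin S.

    The projections run from -S to +S in unit steps, so there are 2S + 1
    of them: one for S = 0 (singlet), three for S = 1 (triplet).
    """
    s = Fraction(total_spin)
    if s < 0 or (2 * s).denominator != 1:
        raise ValueError("total spin must be a non-negative integer or half-integer")
    count = int(2 * s) + 1
    return [-s + i for i in range(count)]


if __name__ == "__main__":
    print("paired spins, S = 0:", spin_sublevels(0))
    print("parallel spins, S = 1:", spin_sublevels(1))
```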

Let  us  return  to  Figure  1  and  trace  the  fate  of  a  molecule  that  is 
excited  by  absorption  of  short  wavelength  light  to  some  excited 
state  having  a  higher  energy  content  than  the  lowest  excited  singlet. 
Following  the  course  of  the  straight  arrow  in  the  figure,  we  will 
produce an S3 molecule and, along with the electronic excitation,
we will put in a certain amount of vibrational energy. In other
words, S3 is not only electronically excited but is also vibrationally
excited. This multiply excited species will then undergo very rapid
decay processes in which some of the excitation is transferred to
other components of the system and simply appears as thermal
energy. The fastest of these relaxation rates is probably vibrational
energy transfer, represented by the short wavy arrow at the top of
the figure. This produces a vibrationally unexcited S3 molecule.
Such a state may occasionally live long enough to undergo an
intersystem crossing to produce an excited triplet, such as T3. However,
the most common fate is simple shedding of electronic excitation
in a process called internal conversion, with the production of an
S1 molecule.

One of the remarkable phenomena of excited state chemistry is
the extraordinary speed with which higher excited states lose some
of their excitation and relax to the lowest excited singlet state.
Fortunately for photochemistry, the lifetimes of S1 states are often
much longer. These molecules frequently live long enough to emit
light called fluorescence or to undergo intersystem crossing to form
triplets and, in some cases, take part in chemical reactions. However,
the chemically most significant fate of S1 is intersystem crossing
to triplets. The T1 state of most molecules turns out to be, by a very
wide margin, the longest-lived electronically excited state. This is
because return from T1 to S0 requires not only transfer of a large
amount of energy to other parts of the system but also a change in
the spin state of one of the electrons. Nature finds the changing of
electron spin functions awkward, irrespective of whether
the change is radiative or nonradiative. The radiative decay of
excited triplet states is sometimes observed and is responsible for
the long-lived emission of many molecules called phosphorescence.
Study  of  the  phosphorescence  spectra  provides  the  most  generally 
valuable  tool  for  gathering  information  concerning  the  structure  of 
triplet  states. 
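
The competition among the fates of an excited state can be sketched as a set of parallel first-order processes. First-order competition is an assumption of this sketch, and the rate constants in the example are hypothetical values chosen only for illustration.

```python
def decay_summary(rate_constants):
    """Return (lifetime, yields) for parallel first-order decay channels.

    rate_constants maps a channel name to its first-order rate constant
    (per second). The lifetime is the reciprocal of the summed rate
    constants, and each channel's yield is its share of that sum.
    """
    if not rate_constants:
        raise ValueError("at least one decay channel is required")
    if any(k < 0 for k in rate_constants.values()):
        raise ValueError("rate constants must be non-negative")
    total = sum(rate_constants.values())
    if total <= 0:
        raise ValueError("at least one rate constant must be positive")
    yields = {name: k / total for name, k in rate_constants.items()}
    return 1.0 / total, yields


if __name__ == "__main__":
    # Hypothetical S1 state: fluorescence competes with intersystem crossing.
    lifetime, yields = decay_summary({"fluorescence": 1.0e7, "intersystem crossing": 9.0e7})
    print(f"lifetime = {lifetime:.2e} s")
    for channel, fraction in yields.items():
        print(f"{channel}: {fraction:.2f}")
```

A state with only slow channels open, as is argued above for T1, has a small summed rate constant and therefore a long lifetime.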

In  addition  to  the  processes  indicated  in  Figure  1,  one  should  add 
chemical  change  and  transfer  of  electronic  excitation  to  other 
molecules.  The  former  is,  of  course,  photochemistry.  Electronically 
excited molecules have very large internal energy contents and are,
therefore, potentially capable of undergoing many chemical changes.
These may include both unimolecular reactions, in which the excited
state simply falls apart or turns into some new compound, and
bimolecular reactions. In bimolecular reactions the excited
state usually reacts with some other molecule in its ground
state  although  reactions  between  two  excited  states  are  occasionally 
observed. 

The  transfer  of  electronic  excitation  is  most  easily  detected  when 
the  energy  acceptor  is  different  from  the  molecule  that  originally 
absorbed the exciting light. Transfer can be readily studied if the
energy acceptor undergoes some characteristic reaction or if light
emission from the acceptor can be observed. When energy transfer
results in chemical transformation of the acceptor, the process is
usually called a photosensitized reaction. The following is a
generalized mechanism for photosensitized reactions:

$$^1S_0 \xrightarrow{h\nu} {}^1S_1$$

$$^1S_1 + {}^1A_0 \rightarrow {}^1S_0 + {}^1A_1$$

$$^1S_1 \rightarrow {}^3S_1$$

$$^3S_1 + {}^1A_0 \rightarrow {}^1S_0 + {}^3A_1$$

$$^1A_1 \text{ or } {}^3A_1 \rightarrow \text{Chemical Reaction}$$

In  our  own  research  group,  we  have  been  very  much  interested  in 
the study of sensitized reactions, both because they are inherently
intriguing and because they offer very attractive prospects for
control and manipulation of photochemistry. For example, we can
sometimes make excited states of acceptor molecules, even though
they cannot be produced by direct optical excitation of the acceptor.
A good example is the case of butadiene:

CH2=CHCH=CH2

If butadiene absorbs light it is promoted to an excited singlet state,
but decay directly back to the singlet ground state is so rapid that
the molecule does not undergo intersystem crossing to a triplet state
with any detectable efficiency. However, excited triplet states of
butadiene can be made very readily by energy transfer from the
excited triplets of any one of a large number of photosensitizers.

A  rather  intriguing  method  for  study  of  energy  transfer  can  be 
based upon the use of bichromophoric molecules in which there
are two nearly isolated, unsaturated units. A few years ago, in
collaboration with Dr. Peter Leermakers and his group at Wesleyan
University, we used both spectroscopic and chemical methods to
study internal transfer in a series of compounds containing both a
naphthalene-like unit and a unit resembling benzophenone, a
common photosensitizer.

Figure 2 shows the ultraviolet absorption spectrum of one of our
bichromophoric compounds. The spectrum looks very much like
that obtained from a mixture of naphthalene and benzophenone, or
more properly 1-methylnaphthalene and 4-methylbenzophenone.

[Structural formula: bichromophoric series, n = 1, 2, 3]

The broken line in Figure 2 shows the deviation of the spectrum
from that calculated assuming that the two unsaturated units do
not interact at all during the process of light absorption. It is
obvious that, although there is a small amount of absorption in
addition to what might have been predicted, there is no evidence
of any very large interaction. However, the emission spectra of the
complex molecules are quite different from those of the simple
models. 

Naphthalene  and  most  of  its  simple  derivatives  emit  fluorescence 
after  light  absorption.  Furthermore,  after  solutions  are  cooled  to 
sufficiently low temperatures so that they become rigid glasses,
naphthalenes also show strong phosphorescence emission arising
from the lowest excited triplet state. Benzophenone, on the other
hand, shows no detectable fluorescence, but in glassy matrices it has
a very strong phosphorescence. Obviously, both compounds undergo
intersystem crossing from excited singlets to triplets. Furthermore,
it appears that the intersystem crossing of benzophenone is so rapid
that  it  entirely  precludes  fluorescence. 

In Figure 3 we show the phosphorescence emission obtained from
methylnaphthalene, from a mixture of methylnaphthalene and
methylbenzophenone, and from one of the bichromophoric
compounds. Phosphorescence from the two-component mixture is
identical to that expected from the two substances acting entirely
independently of one another. The large peaks shown in the trace are
the characteristic highly structured emission from benzophenone
and its simple derivatives. Notice that this emission is completely
absent from the spectrum of the bichromophoric material. In fact,
the latter has an emission that matches almost exactly the
phosphorescence of methylnaphthalene. If we look at a higher frequency
where fluorescence of naphthalene is ordinarily found, we find a
very weak emission from the complex molecule that looks like
highly attenuated naphthalene fluorescence. This is not shown in
the figure because the emission is so very small in comparison to
either unperturbed naphthalene fluorescence or the strong
phosphorescence emission.

Figure 2.

Figure 3.

The  story  that  we  tell  concerning  the  fate  of  the  light  energy  put 
into  the  bichromophoric  compound  is  as  follows:  light  is  absorbed 
by  the  naphthalene  unit  and  produces  an  excited  singlet  state  that 
is essentially naphthalene-like in character. This state has a
sufficient lifetime to allow it to emit a very small amount of light.
However, most of the energy is rapidly lost by being transferred to the
other chromophore. There it produces an excited state that is
essentially the same as the excited singlet state of benzophenone.
Intersystem crossing to produce a benzophenone-like triplet occurs very
rapidly, just as it would in benzophenone itself. However, the
energy is then very quickly transferred back to the naphthalene units,
forming naphthalene-like triplets. Light is then emitted in a
spectrum which matches almost exactly the phosphorescence spectrum of
methylnaphthalene.  This  double  shuttling  of  energy  is  possible 
because  the  separation  between  naphthalene  singlet  and  triplet  states 
is  very  much  larger  than  the  splitting  between  benzophenone  singlet 
and  triplet  states.  The  following  diagram  shows  the  way  in  which 
each  of  the  energy  transfers  can  be  energetically  downhill. 


[Energy-level diagram: singlet and triplet levels of the naphthalene and benzophenone units]
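
The downhill character of the double shuttle can be checked with a few lines of arithmetic. The energies used below are rounded, approximate literature values for the parent compounds and are an assumption of this sketch; they are not given in the text, and the substituted units of the bichromophoric compounds will differ somewhat.

```python
# Approximate excitation energies in kcal/mole (assumed, not from the text).
APPROX_ENERGY_KCAL = {
    ("naphthalene", "S1"): 92.0,
    ("benzophenone", "S1"): 75.0,
    ("benzophenone", "T1"): 69.0,
    ("naphthalene", "T1"): 61.0,
}

# Order of the steps described above: singlet transfer, intersystem
# crossing in the benzophenone unit, then triplet transfer back.
PATHWAY = [
    ("naphthalene", "S1"),
    ("benzophenone", "S1"),
    ("benzophenone", "T1"),
    ("naphthalene", "T1"),
]


def is_downhill(pathway, energies):
    """True if every step of the pathway lowers the excitation energy."""
    return all(energies[a] > energies[b] for a, b in zip(pathway, pathway[1:]))


def singlet_triplet_gap(unit, energies):
    """S1 minus T1 energy for one chromophore."""
    return energies[(unit, "S1")] - energies[(unit, "T1")]


if __name__ == "__main__":
    print("all steps downhill:", is_downhill(PATHWAY, APPROX_ENERGY_KCAL))
    for unit in ("naphthalene", "benzophenone"):
        print(unit, "S1-T1 gap:", singlet_triplet_gap(unit, APPROX_ENERGY_KCAL))
```

With these assumed values both benzophenone levels fall inside the wide singlet-triplet gap of naphthalene, which is the condition the paragraph above describes.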

Most  of  our  study  of  energy  transfer  has  involved  bimolecular 
energy  exchange  rather  than  internal  shuttling  of  the  excitation.  In 
this  work  we  normally  use  the  quantum  yields  of  photosensitized 
reactions  as  a  primary  tool.  However,  there  is  another  scheme 
which  is  interesting  and  shows  some  rather  dramatic  results.  The 
following  equations  show  a  commonly  expected  mechanism  for 
photosensitized  interconversion  of  cis  and  tram  isomers  of  olefmic 
compounds. 

S*  kj 

S*1'1  *  C=c  — - ♦  T  +  S  (1) 

s-  substrate  ' 

triplet 


trans 
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R. 


S* 


(3) 


c=c 


H  H 

cis 


^3 


T  +  S 


(2) 


->  trans 


->  cis 


(3) 

(4) 


In  such  a  system  prolonged  irradiation  leads  to  establishment  of  a 
“photostationary  state”.  This  state,  which  is  sometimes  referred  to 
as  a  photoequilibrium,  represents  a  situation  in  which  the  rates  of 
the  two  opposing  reactions  have  been  made  equal. 


$$R_{c \rightarrow t} = R_{t \rightarrow c}$$


The  composition  of  the  system  at  the  stationary  state  depends  upon 
the  relative  reactivities  of  the  cis  and  trans  isomers  as  energy  ac¬ 
ceptors  and  on  the  relative  amounts  of  the  two  compounds  formed 
by  decay  of  the  excited  triplet  state. 


$$\frac{[\textit{cis}]_s}{[\textit{trans}]_s} = \frac{k_1}{k_2} \cdot \frac{k_4}{k_3} \qquad (5)$$
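
The stationary-state condition can be checked numerically. The sketch below reads the mechanism as follows, which is an assumption about the numbering of the steps: k1 and k2 are the rate constants for energy transfer to the trans and cis isomers, and k3 and k4 are the rate constants for decay of the substrate triplet to trans and to cis. The numerical values are hypothetical.

```python
def photostationary_cis_to_trans(k1, k2, k3, k4):
    """[cis]/[trans] at the photostationary state.

    Assumed reading of the mechanism:
      k1: transfer to trans, k2: transfer to cis   (excitation ratio k1/k2)
      k3: triplet decay to trans, k4: decay to cis (decay ratio k4/k3)
    """
    if min(k1, k2, k3, k4) <= 0:
        raise ValueError("all rate constants must be positive")
    return (k1 / k2) * (k4 / k3)


def isomerization_rates(cis, trans, k1, k2, k3, k4, sensitizer=1.0):
    """Return (cis -> trans rate, trans -> cis rate) for a given composition."""
    to_trans_fraction = k3 / (k3 + k4)
    to_cis_fraction = k4 / (k3 + k4)
    cis_to_trans = k2 * sensitizer * cis * to_trans_fraction
    trans_to_cis = k1 * sensitizer * trans * to_cis_fraction
    return cis_to_trans, trans_to_cis


if __name__ == "__main__":
    # Hypothetical constants: equal transfer rates, decay favoring cis 3:2.
    k1, k2, k3, k4 = 5.0e9, 5.0e9, 2.0, 3.0
    ratio = photostationary_cis_to_trans(k1, k2, k3, k4)
    print("cis/trans at stationary state:", ratio)
    print("opposing rates:", isomerization_rates(ratio, 1.0, k1, k2, k3, k4))
```

At the composition returned by the first function the two opposing rates computed by the second are equal, which is the definition of the photostationary state.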


The  photostationary  mixture  is  obviously  a  complex  function  of 
reactivity  factors.  What  we  would  like  to  obtain  from  its  study  is 
information  concerning  the  variation  in  the  rates  of  energy  transfer, 
k1 and k2. We have used a trick which seems to be partially successful
in disentangling the complex stationary state relationship. First
of  all,  it  is  observed  in  this  and  many  other  kinds  of  studies  that  the 
rate  of  energy  transfer  is  diffusion  controlled  in  solution  if  the  ex¬ 
citation  energy  available  in  the  donor  exceeds  by  more  than  a  few 
kilocalories per mole the energy required to promote the acceptor
to its lowest excited triplet state. This is shown schematically in
Figure 4. Consequently, for high-energy sensitizers we expect that k1
will be equal to k2 and that the excitation ratio will therefore be
unity. In such cases, the photostationary state composition should
provide a direct measure of the decay ratio, k4/k3. This expectation
seems to be verified by experiment. The photostationary state
composition of a number of substrates is essentially invariant although
many different high-energy sensitizers have been used to establish it.
We can then play the game backwards and assume that the decay
ratio remains unchanged when we use sensitizers having lower
excitation energies. Changes in the photostationary state will then tell
us how the excitation ratio changes as the energy available in the
sensitizer is decreased. We can make a straightforward prediction
of what will happen, as is shown in Figure 5.

Figure 4. Reactivity in Triplet Energy Transfer. For the transfer

$$^3S_1 + A \rightarrow S_0 + {}^3A_1$$

with sensitizer excitation energy $\Delta E_S$ and acceptor excitation
energy $\Delta E_A$:

$$k \approx 5 \times 10^{9}\ \text{l mole}^{-1}\ \text{sec}^{-1} \quad \text{if} \quad \Delta E_S \geq \Delta E_A + 3\text{ to }5\ \text{kcal mole}^{-1}$$

Figure 5 is drawn for a case in which the energy required to excite
the cis isomer is higher than that needed to excite the trans
compound. As we lower the sensitizer energy by varying the structure of
the sensitizers, we expect that the rate of transfer to the cis
isomer will begin to fall off while the rate of transfer to the trans
isomer is still at, or close to, the diffusion-controlled limit. The
photostationary mixtures will then become cis-rich and the system
will be optically pumped in the trans-to-cis direction. This trend should
continue until the excitation energy of the trans isomer is
approached. The efficiency of transfer to the trans compound should
then begin to fall off. If the energy deficiency has to be supplied as
a simple thermal activation energy, the rates of the two transfer
reactions should fall off in the same way, after the sensitizers have
become deficient with respect to both excitation processes. This
should mean that the stationary states observed with low energy
sensitizers will again become invariant and probably very rich in
the cis isomer.

Figure 5. Predictions, for ΔEc > ΔEt: (1) decay ratio independent of
the nature of S; (2) excitation ratio nearly invariant with high-energy S;
(3) excitation ratio variable with low-energy S.

Unfortunately, this naive theory turns out to be woefully
incomplete. Figure 6 shows data obtained in the measurement of
photostationary states established between the two isomeric stilbenes.

cis-stilbene, E_T = 57 kcal/mole
trans-stilbene, E_T = 49 kcal/mole

Figure 6. Corrected Plot for Photosensitized Isomerization of the
Stilbenes. Horizontal axis: E_T (kcal/mole), from 40 to 76. Points:
measured photostationary states. Curve: predicted photostationary
states. Sensitizers: 1, cyclopropyl phenyl ketone; 2, acetophenone;
3, benzophenone; 4, thioxanthone; 5, anthraquinone; 6, flavone;
7, Michler's ketone; 8, 2-naphthyl phenyl ketone; 9, 2-naphthaldehyde;
10, 2-acetonaphthone; 11, 1-naphthyl phenyl ketone; 12, chrysene;
13, 1-naphthaldehyde; 14, biacetyl; 15, acetyl propionyl;
16, fluorenone; 17, fluoranthene; 18, 1,2,5,6-dibenzanthracene;
19, duroquinone; 20, benzil; 21, 1,2,3,4-dibenzanthracene; 22, pyrene;
23, 1,2-benzanthracene; 24, benzanthrone; 25, 3-acetylpyrene;
26, acridine; 27, 9,10-dimethyl-1,2-benzanthracene; 28, anthracene;
29, 3,4-benzpyrene.


As is indicated, the excitation energy of cis-stilbene is substantially
higher than that of the trans isomer so the simple theory predicts
that with low energy sensitizers, the photostationary states should
contain too little trans-stilbene to be readily detectable. Figure 6
shows that the first parts of the prediction are correct. High-energy
sensitizers do give essentially a common result, and decreasing the
sensitizer energy below that of cis-stilbene produces a trend in the
cis-rich direction. However, when the excitation energy of
trans-stilbene is approached and passed, the trend is actually reversed and
mixtures become more trans-rich. In attempting to explain this
phenomenon we have decided that we must be observing
unexpectedly high reactivity of cis-stilbene as an energy acceptor with
low-energy sensitizers. Our explanation is that this molecule is able to
pass directly to a twisted state that has a lower energy content than
any that can be reached by electronic excitation while maintaining
the same geometry as is found in the ground state of the molecule.
We call this geometrical accommodation of the energy acceptor
"nonvertical excitation".

We were at first reluctant to invoke the concept of nonvertical
excitation  because  it  does  not  occur  during  light  absorption.  This 
is  an  application  of  the  so-called  Franck-Condon  principle.  How¬ 
ever,  there  is  really  no  reason  to  expect  the  Franck-Condon  principle 
to  hold  for  a  process  such  as  energy  transfer.  This  is  actually  a 
simple  form  of  a  bimolecular  reaction  and  does  not  involve  any 
necessary  interaction  with  the  radiation  field.  Having  once  con¬ 
vinced  ourselves  that  we  needed  a  theory,  we  have  since  put  it  to 
use  and  discovered  a  number  of  unexpected  new  photosensitized 
reactions. 
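
For comparison with the stilbene results, the simple theory that the data contradict can be written as a short numerical sketch. It assumes that transfer is diffusion controlled whenever the sensitizer energy is at least equal to the acceptor energy, and that any deficit is paid as a simple thermal activation energy at room temperature. The sharp threshold and the temperature are simplifying assumptions of this sketch; the diffusion-controlled rate constant and the two stilbene energies are the values quoted in the text.

```python
import math

# RT at an assumed 298.15 K, in kcal/mole.
RT_KCAL_PER_MOLE = 1.987e-3 * 298.15


def transfer_rate(sensitizer_energy, acceptor_energy, k_diffusion=5.0e9):
    """Naive rate constant for vertical triplet energy transfer.

    Diffusion controlled when the sensitizer has enough energy;
    otherwise reduced by a Boltzmann factor for the energy deficit.
    Energies are in kcal/mole.
    """
    deficit = acceptor_energy - sensitizer_energy
    if deficit <= 0:
        return k_diffusion
    return k_diffusion * math.exp(-deficit / RT_KCAL_PER_MOLE)


def excitation_ratio(sensitizer_energy, trans_energy=49.0, cis_energy=57.0):
    """Rate of transfer to trans divided by rate of transfer to cis."""
    return (transfer_rate(sensitizer_energy, trans_energy)
            / transfer_rate(sensitizer_energy, cis_energy))


if __name__ == "__main__":
    for energy in (68.0, 60.0, 55.0, 50.0, 45.0, 40.0):
        print(f"E_T(sensitizer) = {energy:4.0f}  excitation ratio = {excitation_ratio(energy):.3g}")
```

In this model the excitation ratio is unity for high-energy sensitizers, climbs steeply between the two acceptor energies, and levels off below the lower one. It never turns back toward trans-rich mixtures, which is the behavior that the observed results show and that nonvertical excitation is invoked to explain.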

COMMON  PHOTOCHEMICAL  REACTIONS 

Photochemical  reactions  can  be  roughly  grouped  into  the  follow¬ 
ing  five  classes: 

1.  Fragmentation 

2.  Ionization 

3.  Isomerization 

4.  Cycloaddition 

5.  Atom  abstraction 

The first three are unimolecular processes, in which an excited
molecule  either  breaks  into  fragments  or  undergoes  internal  rear¬ 
rangement.  Reactions  4  and  5  involve  attack  of  an  excited  molecule 
on  some  other  species.  I  will  present  a  few,  nearly  randomly  chosen, 
examples  of  the  various  types. 

Fragmentation Reactions. Figure 7 shows a number of representative
fragmentation reactions. One of the oldest known photochemical
reactions is cleavage to form free radicals. The process has been
widely used to effect controlled initiation of chain reactions in which
free radicals play a key role. For example, polymerization of
butadiene or styrene to form high polymers, which are the base of
synthetic elastomers and common plastics, can be initiated by radicals
formed by photolysis of azo compounds (R-N=N-R) or
peroxides (R-O-O-R).

$$\mathrm{CH_3COCH_3} \rightarrow \mathrm{CH_3CO\cdot} + \mathrm{\cdot CH_3}$$

$$\mathrm{R{-}N{=}N{-}R} \rightarrow 2\,\mathrm{R\cdot} + \mathrm{N_2}$$

Figure 7. Formation of free radicals

The  last  of  the  radical-producing  reactions  shown  in  Figure  7  has 
an entirely different application. The lophyl radical (structural
formula not reproduced) is highly colored whereas the parent, dimeric
molecule does not
absorb  visible  light.  Irradiation  of  a  material  containing  the  dimer 
with ultraviolet light, which is absorbed, leads to photolytic
production of the colored radical. The reaction is an example of
photochromism, a process in which exposure of a material to light causes
it  to  change  color.  The  lophyl  radicals  normally  combine  to  regen¬ 
erate  the  parent  compound;  so  the  system  is  called  a  reversible 
photochromic  system.  Reversible  photochromism  is  the  subject  of 
intensive  study  in  many  laboratories  at  the  present  time,  because  the 
effects  have  considerable  potential  value  for  information  storage 
and  relay  and  in  development  of  autoprotective  equipment,  such  as 
photochromic  windshields  for  automobiles  and  airplanes.  Another 
interesting  application  that  I  have  heard  proposed  is  the  use  of 
photochromic  inks  that  would  cause  the  girls  in  certain  magazines, 
especially popular with males, to blush when the pages are opened
and  exposed  to  light.  Unfortunately,  the  reversibility  of  the  photo¬ 
chromic  system  is  not  perfect.  The  lophyl  radicals  undergo  reac¬ 
tions  other  than  dimerization,  so  a  few  are  permanently  lost  in  each 
light-dark  cycle. 
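
The cost of imperfect reversibility compounds from cycle to cycle. The sketch below assumes that a fixed fraction of the photochromic material is lost irreversibly in every light-dark cycle; both that assumption and the one percent loss used in the example are hypothetical, since the text gives no figure.

```python
import math


def fraction_remaining(loss_per_cycle, cycles):
    """Fraction of active material left after a number of light-dark cycles."""
    if not 0 <= loss_per_cycle < 1:
        raise ValueError("loss_per_cycle must be in [0, 1)")
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return (1.0 - loss_per_cycle) ** cycles


def cycles_until(loss_per_cycle, target_fraction):
    """Smallest whole number of cycles after which no more than target_fraction remains."""
    if not 0 < loss_per_cycle < 1:
        raise ValueError("loss_per_cycle must be in (0, 1)")
    if not 0 < target_fraction <= 1:
        raise ValueError("target_fraction must be in (0, 1]")
    return math.ceil(math.log(target_fraction) / math.log(1.0 - loss_per_cycle))


if __name__ == "__main__":
    loss = 0.01  # hypothetical: one percent lost per cycle
    print("left after 100 cycles:", round(fraction_remaining(loss, 100), 3))
    print("cycles to lose half:", cycles_until(loss, 0.5))
```

Even a small loss per cycle therefore limits the useful life of a device that must switch many times.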

Formation  of  Ions.  The  reaction  shown  in  Figure  8  is  another 
kind  of  dissociative  reaction,  in  which  ions  are  produced.  Photolytic 
ionization  seems  to  be  less  common  than  photodissociation  to  form 
radicals; this may be only an artifact of the choice of systems for
study. The example shown, photoionization of triarylmethane
leuconitriles, is of some interest in photochromism because the
triarylcarbonium ions (Ar3C+) are highly colored.

$$\mathrm{Ar_3CCN} \rightarrow \mathrm{Ar_3C^{+}} + \mathrm{CN^{-}}$$

Ar = (CH3)2N-C6H4- or similar group

Figure 8. Formation of ions

Figure 9 shows another kind of ionization reaction, in which an
electron is ejected by the photoexcited species. The second example
is probably a very general type of reaction. When irradiated with
light of wavelengths shorter than about 1000 A, most substances will
probably yield photoelectrons. However, studies of this kind are
not common, because light of wavelength shorter than 2000 A is
absorbed strongly by oxygen, necessitating the use of high vacuum
equipment. Obviously “vacuum ultraviolet” photochemistry will
be much more important in the everyday life of space travelers than
to earthbound people. The ionization reaction shown at the top of
Figure 9 occurs in solution under irradiation with near ultraviolet
light. The reaction is easily observed because the cation radical has
an intense blue color. Despite its apparent simplicity,
photoionization of Wurster's base is rather complex. A quantum of 3000 A
light does not have enough energy to ionize a gaseous molecule of
the base. Ionization probably occurs in some condensed media
because the electron and cation can be stabilized by interaction with
their environments. It is also likely that ionization results from the
interaction of pairs of excited molecules, which results in
concentration of all the excitation energy in one molecule.

Figure 9. Ionization. Top: photoionization of Wurster's base to the
cation radical, Wurster's Blue. Bottom: ionization of a substrate RH
by light of about 1500 A.
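
The energy bookkeeping behind the pairing argument can be sketched as follows. The ionization energy used below is a hypothetical round number chosen only for illustration; the text gives no value for Wurster's base, and the sketch ignores the stabilization by the medium that the paragraph above also invokes.

```python
# Product of Planck's constant and the speed of light, in eV * Angstrom.
HC_EV_ANGSTROM = 12398.42


def photon_energy_ev(wavelength_angstrom):
    """Energy of a single photon in electron volts."""
    if wavelength_angstrom <= 0:
        raise ValueError("wavelength must be positive")
    return HC_EV_ANGSTROM / wavelength_angstrom


def can_ionize(wavelength_angstrom, ionization_energy_ev, photons=1):
    """True if the pooled energy of the given number of photons reaches the threshold."""
    return photons * photon_energy_ev(wavelength_angstrom) >= ionization_energy_ev


if __name__ == "__main__":
    assumed_ionization_energy_ev = 6.5  # hypothetical gas-phase value
    print("one 3000 A quantum:", round(photon_energy_ev(3000.0), 2), "eV")
    print("one quantum suffices:", can_ionize(3000.0, assumed_ionization_energy_ev))
    print("two pooled quanta suffice:", can_ionize(3000.0, assumed_ionization_energy_ev, photons=2))
```

Under this assumed threshold a single quantum falls short while the pooled excitation of two molecules does not, which is the sense of the suggestion that pairs of excited molecules are involved.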

Isomerization Reactions. Figure 10 shows a couple of interesting
isomerization reactions. The first is of interest because the
photoproduct has a much higher energy content than the starting
material. Furthermore, the reaction can be carried out using
sensitizers that absorb visible and near ultraviolet light. If we had
a smooth way to catalyze the thermal reversion reaction, the system
might be of some interest for storage of solar energy. However, I feel
that the time is not yet ripe to fasten on a particular energy-storing
system. Many others will be discovered as photochemistry forges ahead,
and in another ten years the number of potential energy storage and
release systems may be increased tenfold. The second reaction shown
in Figure 10 is another photochromic system that has received a
good deal of attention recently. The colorless form is a member of a
class of materials known as spiropyrans. The colored isomers can
also be produced by heat, so the materials are thermochromic as
well as photochromic.

Cycloaddition Reactions. Figure 11 shows a couple of examples
of photoinduced cycloaddition reactions. The first example,
dimerization of cyclopentenone, occurs under direct irradiation and by
the sensitized technique. The sensitized reactions are believed to
involve triplets; consequently, the fact that identical mixtures of the
two products are formed in the direct and sensitized reactions implies
that intersystem crossing occurs before the chemical action when
light is absorbed directly. However, the relative amounts of the two
products can be altered by changing the solvent in which the reaction
is carried out. This simple mechanism for control of the course
of a photoreaction was somewhat surprising to photochemists,
although the phenomenon is very familiar from study of the dynamics
of ordinary thermal reactions. There has existed a kind of mystique
that goes something as follows: The specific rate constants for
reactions of excited states are very large,* and fast reactions are
often both unselective and relatively insensitive to environmental
influences. Other examples of solvent effects on the course of
photoreactions have come to light within the last few years.

Dimerization of butadiene, the second reaction shown in Figure
11, is an example of a reaction that cannot be effected by direct
excitation but goes smoothly using a variety of photosensitizers as

* This must be so to allow chemical reaction to compete successfully with
decay of the excited state back to the ground state.




Figure 12. Atom abstraction.

primary light absorbers. Furthermore, the relative amounts of the
three products can be changed by varying the sensitizer. This
fascinating effect is attributed to the existence of two
stereoisomeric triplet states of open-chain dienes.

The general lesson to be learned from this example is the necessity
of recognizing that the geometry of an excited state may be
significantly different from that of the same molecule in its ground
state.

Atom Abstraction. The last type of reaction, atom abstraction,
is illustrated in Figure 12. The name arises from a common
mechanistic step, and not from the overall course of the reaction. The
net reaction is usually an addition, but two steps are involved. The
key is abstraction of an atom from some molecule by an excited species.
The first step produces free radicals and the final products are
formed by coupling of the radicals. Aldehydes and ketones, which
contain carbonyl groups (>C=O), are most frequently the excited
species and the atom abstracted is usually hydrogen. The reaction is
probably the commonest method for photochemical production of
free radicals in condensed media and is probably largely responsible
for initiation of the oxidative reaction that leads to degradation of
films and fibers under ordinary use conditions.

I have a feeling that the last part of this presentation may seem
encyclopedic. However, if we could not claim some insight into the
intimate details of photoreactions, we would have no grounds for
hoping that controls necessary for future use of photoreactions as
gears in chemically-based systems could be devised.

XI. Some Properties of the Liquid State

Stuart  A.  Rice 

A.  Linear  Transport  Phenomena  in  Simple  Liquids 

1.  INTRODUCTION 

If a temperature gradient is maintained across a sample of some
substance, there is established a steady state heat flow corresponding
to the transport of energy from the hotter to the colder side. This is
but one of many possible transport phenomena which can occur.
Simple models descriptive of transport phenomena in a dilute gas
have existed for about one hundred years, and a complete kinetic
theory based on the Boltzmann collision equation for about fifty
years.1 In contrast, serious attempts to construct a kinetic theory of
liquids started only twenty years ago.2 The reason for a lag in the
development of a theory of liquids is easy to find. In a dilute gas,
molecules move along linear trajectories which are infrequently
interrupted by binary collisions. Thus, the dynamical evolution of
the state of the gas may be described in terms of the properties of
successive uncorrelated binary collisions, and since the two-body
problem is easily solved there is no difficulty in representing the
steady state linear transport phenomena in terms of the
intermolecular potential and other molecular properties. Now, in a
liquid or dense gas every molecule is in continuous interaction with a
large number of near neighbors. A consequence of this state of
simultaneous interaction is the disappearance of the binary collision
as a uniquely defined dynamical event. Because we cannot use isolated
binary collisions as a basis for dynamical calculations, and because
of the strong correlations between molecular positions, the
description of the non-equilibrium steady state in a liquid must
employ a more sophisticated analysis of N-body dynamics than is
necessary in the description of the properties of a gas. Indeed, it
will be necessary to examine in a fundamental way the nature of
irreversibility in an N molecule system, and to learn how the
description of irreversibility is related to N-body dynamics.

The  starting  point  for  the  discussion  of  transport  phenomena  is 
the  description  of  the  macroscopic  dissipative  processes  in  terms  of 
the  constraints  which  define  the  nonequilibrium  state  of  the  system, 
and a set of coefficients which measure the rate of dissipation.
Dissipative processes arise from the transport of mass, momentum and
energy.  In  each  case  there  exists  a  phenomenological  relationship 
between  a  flux  and  the  force  which  is  responsible  for  the  flux.  In  the 
cases  of  energy  and  mass  transport  we  have  the  Fourier  and  Fick 
equations. 


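\[
\mathbf{q} = -\kappa \nabla T, \qquad
\frac{\partial T}{\partial t} = \kappa \nabla^{2} T,
\]
\[
\mathbf{J}_m = -D \nabla c, \qquad
\frac{\partial c}{\partial t} = D \nabla^{2} c, \tag{1}
\]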

with q and J_m the energy and mass fluxes, κ and D the coefficients
of thermal conductivity and diffusion, T the temperature, and c the
concentration of one of the two components in the medium wherein
diffusion is occurring. In the case of momentum transport the stress
tensor σ and the rate of strain ε play primary roles. For a Newtonian
fluid the principal shearing stresses are proportional to the
corresponding rates of strain and


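\[
\boldsymbol{\sigma} = -\left[ p + \left( \tfrac{2}{3}\eta - \phi \right)
\nabla \cdot \mathbf{u} \right] \mathbf{1} + 2\eta\,\boldsymbol{\epsilon},
\tag{2}
\]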


with φ and η the coefficients of dilatational and shear viscosity, u
the fluid velocity, p the pressure, and 1 the unit tensor. The stress
law (2) when introduced in the equation of motion of the fluid
leads to the Navier-Stokes equation, the starting point for the study
of fluid dynamics.
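
As an illustration of how the second of the Fick equations is used in practice, the sketch below integrates the one-dimensional diffusion equation with a simple explicit finite-difference scheme. This is an added example rather than anything from the lecture; the function names, the periodic grid, and the numerical values (a diffusion coefficient of the order typical of simple liquids) are assumptions chosen for illustration.

```python
def diffuse_1d(c, D, dx, dt, steps):
    """Explicit (forward-time, centered-space) integration of
    dc/dt = D d2c/dx2 on a periodic one-dimensional grid."""
    r = D * dt / dx ** 2
    if r > 0.5:
        # The explicit scheme is numerically unstable beyond this limit.
        raise ValueError("need D*dt/dx**2 <= 0.5 for stability")
    c = list(c)
    n = len(c)
    for _ in range(steps):
        # c[i - 1] wraps to the last cell when i == 0 (periodic boundary).
        c = [c[i] + r * (c[(i + 1) % n] - 2.0 * c[i] + c[i - 1])
             for i in range(n)]
    return c


def mean_square_displacement(c, dx, center):
    """Second moment of the concentration profile about grid index `center`."""
    total = sum(c)
    return sum(ci * ((i - center) * dx) ** 2 for i, ci in enumerate(c)) / total


# Hypothetical numbers: D = 1e-5 cm^2/sec on a grid with 1e-4 cm spacing.
D, dx, dt, steps = 1.0e-5, 1.0e-4, 2.0e-4, 50
start = [0.0] * 201
start[100] = 1.0                      # all solute initially in one cell
profile = diffuse_1d(start, D, dx, dt, steps)
msd = mean_square_displacement(profile, dx, 100)
```

For an initially localized solute the mean square displacement grows as 2Dt. With these numbers the elapsed time is 0.01 sec and 2Dt is 2 × 10⁻⁷ cm², which the scheme reproduces while conserving the total amount of solute.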

For the simple fluids considered in this lecture, Eqs. (1) and (2)
provide an accurate representation of dissipative behavior. The
coefficients κ, D, η and φ may be determined experimentally by a
variety of methods based upon suitable solution of the appropriate
differential equation. It is found that all of the transport
coefficients vary when the temperature and density of the liquid are
varied. It is observed that, at constant external pressure, D
increases exponentially and η decreases exponentially as T is
increased. Under the same conditions κ is much less sensitive to
changes in temperature than are η and D. For most simple liquids D is
of the order of 10⁻⁵ cm²/sec, κ of the order of 10⁻⁴ cal/cm sec °C,
and η of the order of 5 × 10⁻³ dyne sec/cm². The dilatational
viscosity, φ, is partially responsible for the attenuation of sound in
a liquid. In simple liquids, e.g. Ar, N₂, φ is of the same order of
magnitude as η.

Suppose that, for some liquid, the several transport coefficients
have been determined as a function of temperature and density.
How can these data be interpreted in terms of molecular dynamics
and the structure of the liquid? The extant theories of transport
phenomena, which deal with just such an analysis, may be
conveniently grouped into four classes: (a) simple quasi-solid or
quasi-gas models with many empirical parameters, (b) phenomenological
analyses based upon the principle of corresponding states, (c)
statistical mechanical theories starting from the rigorous Liouville
equation but employing simplifying approximations, and (d)
developments which lead to formally exact results, but which may be
difficult to use for a practical calculation.3 In the following we
examine the Rice-Allnatt theory which is an example of class (c).
Mathematical details of the derivations are available in the
literature4 and will not be repeated herein. We shall focus attention
exclusively on the nature of the physical arguments, the validity of
the approximations, and the implications of the theory in other
contexts. Wherever possible we shall also examine the agreement
between theory and experiment, and the relationship between the
Rice-Allnatt theory and the approaches of class (d).

2. GENERAL COMMENTS

The  development  of  a  statistical  theory  of  transport  phenomena  in 
liquids  requires  consideration  of  many  problems,  among  which  are: 

(a)  Analysis  of  the  means  by  which  the  time  reversible  equations 
of  classical  and  quantum  mechanics,  which  are  used  to  describe  the 
motions  of  molecules,  lead  to  the  time  irreversible  flux  equations 
displayed  in  Equation  (1), 

(b)  Derivation  of  a  suitable  kinetic  equation  determining  the 
time  evolution  and  phase  dependence  of  some  ensemble  probability 
distribution, 

(c) Solution of the kinetic equation to obtain relationships
between the macroscopic parameters η, φ, κ and D and the
intermolecular potential, number density, temperature, etc.

The reader should recognize that the calculation of transport
coefficients is, in fact, only a small part of the general problem of
describing time dependent phenomena. Namely, it is concerned
with that state of a fluid in which all time dependence resides in
the local hydrodynamic flow velocity. The general problem also
involves the description of those short lived processes whose time
dependence is explicit. Such processes generally depend on the nature
of the initial state, the history of the evolution of the system and
other factors. They are called non-Markovian processes because of
the dependence on history, etc. In the hydrodynamic regime the
processes involved depend only on the instantaneous state of the
system, and are thereby classified as Markovian. The asymptotic
approach of the exact kinetic equations describing non-Markovian
processes to the Markovian equations of the hydrodynamic regime,
with which we are concerned in this section, is discussed later.

The exact kinetic equations for a dense fluid can be displayed only
in the most formal way at the present time. Consequently, their
asymptotic Markovian form is unknown, and the forms of the
equations derived to describe a dense fluid are based on a set of
approximations which simultaneously define both an intuitive analysis
of the nature of random processes and a simple physical description
of the fundamental dynamical processes influencing the evolution
of the state of the liquid.

The method of obtaining equations satisfied by the one- and
two-molecule distribution functions f(1)(Γ_1; t), f(2)(Γ_2; t),
respectively, is essentially that of integrating the N-molecule
distribution function f(N)(Γ_N; t) over the sub-phase space of all the
other molecules in the system. Now, f(N) satisfies the Liouville
equation, and is not known explicitly. Therefore, one may only obtain
differential equations for f(1) and f(2) by integrating the Liouville
equation term by term. The result is a coupled hierarchy of equations;
i.e., the equation for f(1) also involves f(2), the equation for f(2)
also involves f(3), and so on. It is necessary to truncate or otherwise
rearrange this hierarchy at some point in order to obtain closed
equations for f(1) and f(2).
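
For orientation, the first member of this hierarchy can be written out explicitly. The form below is a sketch under stated assumptions that are not taken from the lecture: structureless molecules of mass m interacting through a pairwise additive potential u(R_12), no external forces, and reduced distribution functions normalized so that f(1) integrates to N and f(2) to N(N-1). The symbols R_i, p_i and dΓ_2 = dR_2 dp_2 are introduced here for illustration.

```latex
% First equation of the coupled hierarchy (assumed notation, see lead-in).
\frac{\partial f^{(1)}}{\partial t}
  + \frac{\mathbf{p}_1}{m} \cdot \nabla_{\mathbf{R}_1} f^{(1)}
  = \int \nabla_{\mathbf{R}_1} u(R_{12}) \cdot
         \nabla_{\mathbf{p}_1} f^{(2)}(\Gamma_1, \Gamma_2; t)\, d\Gamma_2
```

The right-hand side contains f(2), whose own equation of motion contains f(3) in the same way; this is the coupling that makes truncation or rearrangement necessary.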

For a classical fluid we describe the system of N structureless
molecules in the volume V by use of the Hamiltonian equations of
motion. These equations have some interesting general implications.
Since there is one equation for each degree of freedom of the system,
it follows that the phase of the system at any instant is uniquely
determined by the phase at any other instant. In accordance with
the definition of a Markov random process, it follows that the phase
of the system, Γ_N, may be regarded as a Markov process of a simple
kind. (The transition probability is a δ-function, since the increment
of the variable Γ_N has only one possible value for each time instant.)
The kinetic equations for the reduced distribution functions f(1),
f(2), . . . are concerned with the random variables Γ(1), Γ(1,2),
. . . which are of smaller dimensionality. Now, it is well known
that the projection of a Markov process of 6N dimensions onto a
space of smaller dimensionality (6, 12, . . . dimensions) generally
yields a random process of higher order. Thus, Γ(1), Γ(1,2), . . .
will be non-Markovian processes of higher order. This general
feature has been obtained in the analysis of Prigogine and
co-workers.6 They find that the stochastic interaction term has the
form of a time-convolution over the history of the variable. The
important result is that when the system has reached a stationary
state, the kinetic equations reduce to Markovian form.
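
The statement that projecting a Markov process onto fewer variables generally destroys the Markov property can be illustrated with a toy model. The sketch below is an added illustration, not part of the lecture: a hypothetical three-state chain with deterministic transitions (the discrete analogue of a δ-function transition probability) is observed only through a coarse label that lumps two of its states together, and the conditional probabilities of the observed label are computed by brute-force enumeration. All names are invented for the example.

```python
from itertools import product


def sequence_probability(P, pi0, label, symbols):
    """Probability of observing `symbols` in succession, summed over all
    hidden-state paths of the underlying Markov chain."""
    n = len(P)
    total = 0.0
    for path in product(range(n), repeat=len(symbols)):
        if any(label[s] != y for s, y in zip(path, symbols)):
            continue  # path is inconsistent with the observed labels
        p = pi0[path[0]]
        for a, b in zip(path, path[1:]):
            p *= P[a][b]
        total += p
    return total


def conditional_next(P, pi0, label, history, nxt):
    """P(next label = nxt | observed history) for the lumped process.
    Assumes the history itself has nonzero probability."""
    history = list(history)
    return (sequence_probability(P, pi0, label, history + [nxt])
            / sequence_probability(P, pi0, label, history))


# Full description: deterministic cycle 0 -> 1 -> 2 -> 0 (a Markov chain).
P = [[0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0],
     [1.0, 0.0, 0.0]]
pi0 = [1.0 / 3.0] * 3          # stationary distribution of the cycle
label = ["a", "b", "b"]        # reduced description: states 1 and 2 look alike

after_ab = conditional_next(P, pi0, label, ["a", "b"], "a")
after_bb = conditional_next(P, pi0, label, ["b", "b"], "a")
```

Both histories end in the same observed label b, yet the probability that a follows is 0 after (a, b) and 1 after (b, b). The reduced process therefore remembers more than its current value, which is what is meant by a random process of higher order.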

The problem of analyzing further the coupled hierarchy of kinetic
equations has, therefore, two distinct features. Since an integration
over the sub-phase space of (N-1) or (N-2) molecules leaves the
equations completely reversible, the analysis used must, in the first
place, single out the features (e.g. time scale) which make the
equations irreversible. Secondly, it must provide a means of picking
out the Markovian features that the kinetic equations contain in the
hydrodynamic regime. The introduction of irreversibility is not
difficult; it is merely contingent upon the particular method by
which the Markovian feature of the description is achieved. For
some particularly simple systems, for example, a heavy particle in a
linear chain of harmonically coupled particles,0 the equations of
motion may be solved exactly. It is then found that there is a time
scale on which the system exhibits irreversible behaviour in the
sense that the velocity autocorrelation function of the heavy particle
tends to zero, etc. For a sufficiently long time, recurrences of the
mechanical state occur, and complete reversibility is observed.
Whether or not irreversible behaviour is observed is then linked
to the time scale on which the system is observed. The level of detail
with which the dynamics is described is also of importance. For,
mechanical quantities which depend on the positions and momenta
of all N particles in the system do not exhibit irreversible
behaviour, even in the sense described. Of course, almost all
macroscopic properties of a system depend on the averaged behaviour of
only a small number of molecules, say pairs, triplets, . . . . It is
these quantities which behave as described.

Unfortunately,  there  are  very  few  systems  for  which  the  N-particle 
dynamics can be analyzed exactly. In addition, at present no
systematic and analytic procedure for determining the quasi-Markovian
behaviour  of  an  evolving  system  is  known.  Thus,  it  is  necessary  to 
adopt  approximate  methods  to  extract  from  the  N-body  dynamics 
the  features  desired.  It  must  be  understood  that  these  approximate 
procedures  do  not  “introduce”  irreversibility  where  a  complete  and 
correct  analysis  would  not.  Rather,  the  approximations  used  are 
intended  to  provide  adequate  solutions  to  the  N-body  problem  in 
some  time  or  space  domain.  The  one  we  shall  adopt  is  based  on  that 
first  proposed  by  Kirkwood.2 

Consider now the relationship between non-Markovian processes
in the subphase spaces of one, two, . . . molecules, and the
ultimate transition to a Markovian kinetic equation defined on these
same subphase spaces. We wish to assert that an nth order process
can be treated as an n-dimensional Markov process, the reduction
being accomplished by grouping the states of the process into
hyperstates. Each hyperstate in the Markov process contains
information about the history of the system during the interval t_m to
t_{m+n-1}. Much of this information is superfluous for the evaluation
of the distribution functions in the hydrodynamic regime, but the
information needed is contained within the hyperstate. The method of
reducing the hierarchy of coupled equations for the distribution
functions is, therefore, a means of extracting the relevant
information from the hyperstate. The particular contribution to the
theory made by Kirkwood, which we have already mentioned, is the
hypothesis that the relevant information for present purposes is
contained in the exact distribution function averaged over an interval
of time τ. The average value for an interval τ on the fine-grained
time scale is then associated with a single point on a coarse-grained
time scale, and the process is known as coarse graining (in time*).
The kinetic equations obtained in this way are, in principle,
difference equations, but it turns out that the times during which
changes become significant on a hydrodynamic scale are so long
compared to the coarse-graining time that no significant error is
introduced by treating the differences as differentials.
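
Written out, the time-smoothing operation just described amounts to the following. This is a sketch in notation assumed here for illustration (an overbar for the smoothed function, s for the integration variable), not a formula quoted from the lecture.

```latex
% Time smoothing over an interval \tau, and the difference quotient
% that is replaced by a derivative on the coarse-grained time scale.
\bar{f}^{(n)}(\Gamma_n; t)
  = \frac{1}{\tau} \int_{0}^{\tau} f^{(n)}(\Gamma_n; t + s)\, ds ,
\qquad
\frac{\bar{f}^{(n)}(\Gamma_n; t + \tau) - \bar{f}^{(n)}(\Gamma_n; t)}{\tau}
  \approx \frac{\partial \bar{f}^{(n)}}{\partial t}
```

The second relation expresses the replacement of differences by differentials, which is acceptable when the smoothed function changes little over one interval τ.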

The introduction of irreversibility which must accompany the
coarse graining is accomplished by the assumption that a time
interval τ exists such that the dynamical behavior of the system
during one interval is related in a simple statistical manner to the
dynamical events of the previous interval. It may be shown that the
statistical character of the relation is sufficient to render the
process irreversible.

The statistical assumption, or ansatz, can be analyzed on the basis
of an intuitive picture of the dynamics of liquid molecules. Consider
first the Fokker-Planck equation describing the behavior of a
Brownian particle.4 This equation describes a stochastic process
under conditions such that the transition probability (for the phase
Γ_B of the Brownian particle) is that for a stationary Markov process.
In turn, this may be shown to be the result of allowing the time
resolution of the description of the Brownian particle to be
sufficiently coarse that transient behavior associated with the
approach to local equilibrium in the molecular motions cannot be
resolved. Thus, the description of Brownian motion as a Markov process
applies only to the discussion of processes taking place on a time
scale longer than some t_c characteristic of the dynamical behavior
of the liquid molecules. In the development of the theory t_c is
chosen using physical criteria such that the basic dynamical event
(in this case molecular fluctuations) is statistically independent of
prior events. Were this not the case, the transition probability
connecting two dynamical states of the Brownian particle would not be
Markovian.

* It is possible also to coarse grain in space; similar kinetic
equations are obtained.

The problem of Brownian motion is concerned with numerous
small momentum transfers, or numerous small particle displacements.
At the other extreme of behavior, where momentum transfers
may frequently be large and where displacements may be large,
is the dilute gas. Transport phenomena in a dilute gas are usually
described by a one molecule distribution function, which satisfies a
kinetic equation (Boltzmann equation) in which the effects of
molecular interaction appear in the form of isolated binary
collisions.1 The rate of change of the distribution function is
determined by the slow secular variations of f(1) due to streaming in
phase space, on which are superimposed the effects of the binary
collisions. On the average, a molecule moves a long distance (relative
to its size or the range of the intermolecular forces) before
undergoing an encounter. Although there is a large volume of phase
space wherein there occur small angle deflections resulting from
binary collisions, large angle deflections are also frequent. Indeed,
large angle deflections are responsible for most of the transport of
energy and momentum due to collisions. From the numerous studies of
the derivation of the Boltzmann equation from the first principles of
statistical mechanics it is found that the assumptions required to
effect a derivation are basically three in number: the neglect of
interactions of higher order than binary collisions, the condition of
molecular chaos (i.e., the condition that every pair of colliding
molecules is statistically independent prior to the collision), and
the slow secular variation of f(1) in space. Of these conditions, only
the molecular chaos is responsible for the irreversibility.

Of course, in a liquid both small and large momentum transfers
occur. How can we describe the properties of this system? The
meaning of molecular chaos in a dilute gas is that molecule 2 (which is
due to collide with molecule 1) has approached molecule 1 from
infinity and its distribution of possible velocities has not been
affected by collisions with molecules which have recently collided
with molecule 1. In a dense fluid, molecule 1 may undergo a rigid
core, i.e., strongly repulsive, collision with a second molecule which
has for some time past been in the region of the first coordination
shell of molecule 1. Thus, molecule 2 should already have an intimate
statistical “knowledge” of molecule 1, and may indeed have undergone a
rigid core collision with molecule 1 in the immediate past. However,
if the quasi-Brownian motion produced in the molecules by the van
der Waals part of the forces is sufficiently effective in causing
molecule 2 to forget its previous experience, then successive rigid
core collisions should satisfy the simple form of molecular chaos
used.

It  is  possible  to  formulate  a  consistency  condition  on  the  passage 
from  the  non-Markovian  to  the  Markovian  description  of  the  fluid. 
In  a  sense,  the  distribution  function  may  be  thought  of  as  a  vector 
in  a  continuous  space  whose  components  represent  the  occupation 
probabilities  of  the  various  states  of  the  phase  space.  In  the  most 
general  case,  the  probability  of  finding  the  set  of  states  (p(n),  {n}) 
depends  on  the  past  history  of  the  system.  There  is,  however,  a 
limiting  case  for  which  the  past  can  be  ignored:8 

If, no matter what the sequence (p(n), {n}, t_1), (p(n), {n}, t_2),
. . . , is, we always end up with the same assignment of probabilities
for being in each of the states of (p(n), {n}, t), then the preceding
sequence can have no influence on the transition.

This condition is used as follows: Let it be assumed that there
exists a time interval τ such that the following dynamical event,
defined in τ, defines a Markov process. The dynamical event
consists of a strongly repulsive binary encounter followed by a
quasi-Brownian motion of the pair of molecules in the fluctuating
field of all the neighboring molecules. Because the destruction of
correlations by the quasi-Brownian motion is efficient, successive
strongly repulsive encounters are statistically independent. The
compound dynamical event is, therefore, asserted to be independent of
prior events of the same kind.*†

* Recent studies of neutron diffraction from liquid Ar confirm the
accuracy of this hypothesis: B. A. Dasannacharya and K. R. Rao, Phys.
Rev. 137, A417 (1965).

† The dynamical events are, of course, the interaction of the molecule,
pair of molecules, etc., under consideration, with their environment.
Clearly, the phase of the molecule, pair etc., is not independent of
the phase during a previous interval; it is the phase of the
environment which is (assumed) independent of the phase during a
previous interval.

If, no matter what the sequence (p(n), {n}, t_1), (p(n), {n}, t_2),
. . . , is, we always end up with the same assignment of probabilities
for being in each of the states of (p(n), {n}, t), then it is necessary
that the relaxation time for return to the states of (p(n), {n}, t) be
short relative to the time interval on which the fundamental dynamical
event is defined. Thus, if it can be shown that the relaxation time
for the return to local equilibrium is much shorter than the time
between strongly repulsive binary encounters, the initiation of the
dynamical event consisting of a strongly repulsive binary encounter
followed by a quasi-Brownian motion always starts from the same
distribution function. In this case the probabilities for being in
each of the states of (p(n), {n}, t) just define the distribution
function, and the required condition is satisfied.

A very interesting and fundamental analysis of the role of
coherence time in the statistical mechanics of irreversible processes
has been given by Fano9 using some ideas and techniques introduced
by Zwanzig.10 Fano shows that in the limit that the dynamical
coherence between a subsystem and its surroundings (reservoir) is
short lived, the effective interaction between reservoir and system is
weak irrespective of the magnitude of the intermolecular potential.
From Fano’s analysis Hurt and Rice11 have developed a formal
coherence time expansion for the classical fluid, and show that:

(a) In the limit of short memory of dynamical coherence, the
Rice-Allnatt kinetic equations (see following) are a valid description
of steady state phenomena in the liquid.

(b) Despite the fact that the usual expansion parameters ρσ³ or
ε/kT are not useful in the liquid, there does exist a qualitatively
different expansion parameter, τ_c/τ, where τ_c is the lifetime of
dynamical correlations and τ is the time between dynamical events.
The new parameter appears naturally because, when the surrounding
medium has the property of propagating away or otherwise
destroying dynamical correlations in the subsystem of interest, it is
not pertinent to measure the strength of the interaction in terms
relating to the spacing of the continuous spectrum of the Liouville
operator of the surrounding medium. All that is pertinent in this
case is the lifetime of dynamical correlations. For the case of a
perturbation in momentum space, Rice and Allnatt have shown4
that the lifetime of the dynamical correlation is an order of
magnitude less than the time between dynamical events, thus justifying
truncation of the coherence time expansion after terms in τ_c/τ.

(c) The fundamental hypothesis of time smoothing is a natural
expression of the nature of the coherence time expansion.

We have already mentioned the Rice-Allnatt kinetic equations (see (a)
and (b)). These were developed before the derivation of the coherence
time expansion, using physical arguments with content substantially
identical to the formal results of the coherence time analysis. For
simplicity, we shall discuss the theory in intuitive terms.

The theory of irreversible phenomena in liquids developed by Rice and
Allnatt was, in the first instance, relevant to a model monoatomic
dense fluid in which the intermolecular potential has the form of a
rigid core repulsion superimposed on an arbitrary soft potential.
Subsequent analysis has shown that the extension of the model to
include more realistic potentials presents no formal difficulty,
provided that the repulsive potential is sufficiently short ranged.

What advantage results from separating the intermolecular potential
into two parts and treating their effects separately? Quite simply,
the difference in range and strength of the repulsive core and the
soft potential allows the discussion of the molecular motion in terms
of two time scales: one corresponds to the large momentum and energy
transfers which occur during a strongly repulsive encounter, while
the other corresponds to the frequent small momentum and energy
transfers which occur during the quasi-Brownian motion of a molecule
in the superimposed soft force field of all the molecules in its
surroundings. The short range of the strongly repulsive core implies
that encounters of the first class are of short duration, so that the
probability that a molecule undergoes such encounters with two or
more others simultaneously is small enough to be neglected. The
introduction of the idealized rigid core representation for this
class of encounters may thus be regarded as a formal device for
restricting consideration to binary encounters (i.e., rigid core
encounters between not more than two molecules). It has the
additional advantage of considerably simplifying the mathematical
details of the solutions of the equations but, we believe, without
significantly affecting the numerical results.

The Markovian property of the kinetic equations is introduced into
the analysis by the use of the Kirkwood hypothesis that a time
interval τ exists such that the dynamical events occurring in one
interval are independent of those in the preceding intervals. The
dynamical event is identified as a rigid core encounter followed by
erratic or quasi-Brownian motion in the fluctuating soft force field
of the neighboring molecules. This identification is contingent upon
the effectiveness of the quasi-Brownian motion at causing the
environment to forget the momentum with which a molecule rebounded
after the rigid core encounter. This in turn implies that the
relaxation time for the equilibration of the momentum due to the soft
force alone is much shorter than that due to rigid core encounters
alone. It may be shown that this physical statement is supported by
detailed calculation of the appropriate relaxation times for the
motion considered.

The introduction of irreversibility in the manner described leads to
a set of integrodifferential equations describing the evolution of
the coarse-grained singlet f^(1)(1), doublet f^(2)(1,2), etc.,
distribution functions. Details of the derivations may be found
elsewhere.4 Here we merely state that each of the kinetic equations
involves, in the interaction term, a sum of repulsive short ranged
and weak longer ranged scattering operators. For the singlet
distribution function the short ranged scattering term is similar in
form to the Enskog kernel in the kinetic theory of the rigid sphere
fluid.* The weak interaction scattering is described by a weak
coupling master operator which, when the friction is independent of
momentum, reduces to a Fokker-Planck operator. Rice and Allnatt use
the form involving the Fokker-Planck operator because the more
general kinetic equation is so complex that solutions are difficult
to obtain. It should be noted that the weak coupling part of the
equation derived by the use of the Kirkwood hypothesis is identical
with that derived by Prigogine.9 The coherence time expansion of Hurt
and Rice also leads to the Rice-Allnatt kinetic equation. Finally,
using a functional integral approach which completely eliminates the
use of the Kirkwood hypothesis, Popielawski and Rice12 have derived
the Rice-Allnatt kinetic equation and shown its relationship to a
summed form of the Prigogine perturbation theory.
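For orientation, the Fokker-Planck limit mentioned above can be
written out explicitly. What follows is a sketch of the standard
momentum-space Fokker-Planck operator with a momentum-independent
friction coefficient, not a transcription of the Rice-Allnatt
equation itself. The notation (ζ_S for the soft friction coefficient,
p_1 for the momentum of molecule 1) and the omission of the
mean-force and streaming-velocity terms are assumptions made here for
illustration.

```latex
\[
\left(\frac{\partial f^{(1)}}{\partial t}\right)_{\mathrm{soft}}
  \simeq \zeta_S \, \nabla_{p_1} \cdot
  \left[ \frac{p_1}{m}\, f^{(1)} + kT \, \nabla_{p_1} f^{(1)} \right]
\]
```

The first term in the bracket is a frictional drag and the second a
diffusion in momentum space. The bracket vanishes identically for a
Maxwellian, f^(1) proportional to exp(−p_1²/2mkT), which is the sense
in which the soft force drives the momentum distribution toward
equilibrium.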

The Rice-Allnatt kinetic equations may be solved analytically when
there are only small deviations from equilibrium. The solutions,
which depend on the temperature gradient, velocity gradient.
(d) ion mobility:

(11)

where  q  is  the  charge  on  the  ion,  mi  the  mass  of  the  ion,  and  the 
appropriate  value  of  (s  is  for  the  ion-molecule  core  interaction. 

(3) The friction coefficient ζ_S, which appears in all the preceding
formulae, has not yet been computed with comparable accuracy. Three
different theoretical estimates are:6

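(i)
\[ \zeta_S^2 = \frac{\rho m}{3} \int \nabla^2 u(R)\, g(R)\, d^3R, \qquad (12) \]

(ii)
\[ \zeta_S = -\frac{\rho}{12\pi^2} \left( \frac{\pi m}{kT} \right)^{1/2}
   \int_0^\infty k^3\, u_S(k)\, G(k)\, dk, \qquad (13) \]

with

\[ u_S(k) = \int u_S(R)\, e^{i k \cdot R}\, d^3R, \qquad
   G(k) = \int [\, g(R) - 1 \,]\, e^{i k \cdot R}\, d^3R, \]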

(iii)  =  &>u>  4* 

U3'  =  -|„Vg(,)£u  (»mkT)‘*  J‘*. 

i  r00 

J<1  =  4w»kT  I  dk[k«*os(k<*)  —  sm(kir)Ju"(k) 


•P  =  -  m»(M)  t»Hk)G(l)G(lk  - U ). 

(14) 

A  discussion  of  these  formulae  is  deferred  to  Section  3. 

In Eqs. (8)-(14), m is the mass of a molecule, σ the hard core
diameter of a molecule, ρ the number density of the liquid, g(σ) the
pair correlation function when two molecules are in contact at R = σ
and g(R) the pair correlation function for a molecular separation R,
ζ_S the friction coefficient arising from the autocorrelation of the
soft (long-ranged) component of the force acting on a molecule, u the
intermolecular pair potential, u_S(k) the Fourier transform of the
soft part of the intermolecular potential and G(k) the Fourier
transform of (g(R) − 1).
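To make the small-step estimate concrete, the sketch below evaluates
an expression of the Rice-Kirkwood form, ζ_S² = (ρm/3) ∫ ∇²u(R) g(R) d³R,
by numerical quadrature and converts the result to a diffusion
coefficient through the Einstein relation D = kT/ζ. Several
ingredients are assumptions introduced only for illustration and are
not taken from the text: the Lennard-Jones potential stands in for
the soft potential, g(R) is replaced by a crude step function (0
inside σ, 1 outside), only the soft friction is used in the Einstein
relation, and all function names are hypothetical.

```python
import math


def lj_laplacian(r, eps, sigma):
    """Laplacian u'' + 2u'/r of u = 4*eps*[(sigma/r)**12 - (sigma/r)**6]."""
    sr6 = (sigma / r) ** 6
    return 4.0 * eps * (132.0 * sr6 * sr6 - 30.0 * sr6) / (r * r)


def step_g(r, sigma):
    """Crude stand-in for the pair correlation function (assumption)."""
    return 1.0 if r >= sigma else 0.0


def soft_friction_small_step(rho, m, eps, sigma, g=step_g,
                             r_max_factor=20.0, n=20000):
    """Friction coefficient from zeta**2 = (rho*m/3) * integral.

    The radial integral of 4*pi*r**2 * laplacian(u) * g is taken from
    sigma to r_max_factor*sigma with Simpson's rule (n must be even).
    """
    a = sigma
    b = r_max_factor * sigma
    h = (b - a) / n

    def integrand(r):
        return 4.0 * math.pi * r * r * lj_laplacian(r, eps, sigma) * g(r, sigma)

    total = integrand(a) + integrand(b)
    for i in range(1, n):
        weight = 4.0 if i % 2 else 2.0
        total += weight * integrand(a + i * h)
    integral = total * h / 3.0
    return math.sqrt(rho * m * integral / 3.0)


def einstein_diffusion(kT, zeta):
    """Einstein relation D = kT / zeta."""
    return kT / zeta


# Reduced units (rho = m = eps = sigma = 1) keep the example unit-free.
zeta_reduced = soft_friction_small_step(1.0, 1.0, 1.0, 1.0)
```

For the step-function g(R) the integral can be done by hand: it equals
−4πσ²u'(σ) = 96πεσ, so ζ_S² = 32πρmεσ. The numerical result can be
checked against this closed form before a tabulated g(R) is
substituted through the `g` argument.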

3.  COMPARISON  OF  THEORY  AND  EXPERIMENT 

We now consider how the predictions embodied in Eqs. (3)-(14) agree
with the available data. First, however, we may ask which data are
available. In the case of simple liquids the most commonly measured
transport property is the shear viscosity.3 Fewer measurements of the
thermal conductivity have been made, and the self-diffusion
coefficient is known only for a very small number of simple liquids:
Naghizadeh and Rice13 studied Argon, Krypton, Xenon and Methane,
while Cini-Castagnoli14 reported one measurement of D for liquid
Argon at 84.5°K (which is in fairly good agreement with the
measurements of Naghizadeh and Rice), and a few measurements of D for
liquid CO.15 The diffusion coefficient of liquid CH4 has also been
deduced from spin echo measurements.16 There are still fewer
experimental determinations of the bulk viscosity. Studies of Ar and
N2 have been reported by Naugle and Squire.17 The mobilities of ions
in liquid Ar, Kr and Xe have been studied experimentally by Davis,
Rice and Meyer,18 and by Henson19 (Ar), who has also measured the
mobility of positive ions in liquid Nitrogen. Henson's results for
liquid Ar are in agreement with those of Davis, Rice and Meyer.

The Rice-Allnatt theory predicts that, at constant density, the shear
viscosity is little affected by changes in temperature, and Lowry,
Rice and Gray20 have shown that, in view of the sensitivity of the
theory to the imperfectly known radial distribution function, this
prediction is in agreement with Zhdanova's results for liquid
Argon21 (see Table I).

TABLE I

Shear Viscosity of Liquid Argon (millipoise)

State                     90°K       128°K      133.5°K    185.5°K
                          1.3 atm    50 atm     100 atm    500 atm

η(obs)                    2.30       0.855      0.843      0.869
η(calc)a                  1.41       0.092      0.701      0.874
η(calc)b                  1.67       0.817      0.821      1.000
η(obs)_T/η(obs)_128°      -          1.00       1.01       1.04
η(calc)_T/η(calc)_128°    -          1.00       1.005      1.06

a Rice-Allnatt theory.
b Wei-Davis modification of Rice-Allnatt theory.

The temperature dependence of the thermal conductivity deviates
slightly from linearity in the temperature region up to the critical
temperature, in the direction such that it decreases with increasing
temperature. We also note that the magnitude, and the pressure and
the temperature dependences of κ, all decrease in the following
order:22 CH4, Ar, Kr, Xe. Above 100 atm the coefficient of thermal
conductivity seems to increase almost linearly with the pressure and
we mention that, according to Ikenberry and Rice,22 (∂κ/∂p)_T for CH4
is considerably larger than for Ar, Kr and Xe. These investigators
tested the Rice-Allnatt theory against their experimental results and
found good agreement (~10%) (see Table II).


TABLE II

Thermal Conductivity of Liquid Argon
(Units of κ are 10^-5 cal/sec cm °K)

State          128°K      133.5°K    185.5°K
               50 atm     100 atm    500 atm

κ(obs)         18.9       18.6       18.7
κ(calc)a       16.9       15.9       17.0
κ(calc)b       12.5       14.0       17.0

a Rice-Allnatt theory.
b Wei-Davis modification of Rice-Allnatt theory.
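The level of agreement quoted for the thermal conductivity can be
read off Table II directly. The short sketch below computes the
percentage deviation of the Rice-Allnatt row from the observed row;
the numbers are the table entries as printed above, and the function
and variable names are hypothetical.

```python
def percent_deviation(calc, obs):
    """Signed deviation 100*(calc - obs)/obs for each state point."""
    return [100.0 * (c - o) / o for c, o in zip(calc, obs)]


# Table II entries for the three state points, in the table's own units.
kappa_obs = [18.9, 18.6, 18.7]
kappa_calc_a = [16.9, 15.9, 17.0]  # Rice-Allnatt theory (row a)

deviations = percent_deviation(kappa_calc_a, kappa_obs)
```

Every entry of `deviations` is negative, i.e. the theory lies below
the measurement at each of the three state points, by an amount of
the order of the 10% quoted in the text.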

Examination of the Rice-Allnatt theory shows that one of the major
theoretical problems is the determination of the self-diffusion
coefficient, not only because diffusion is a purely kinetic
phenomenon, but also because the Rice-Allnatt determination of the
other transport properties depends strongly on the value of the
friction coefficient, i.e., on D. The experimental results of
Naghizadeh and Rice13 fit very well a linear relationship for the
isobaric temperature dependence of the logarithm of D. Naghizadeh and
Rice also observe that the self-diffusion coefficient decreases
exponentially with increasing pressure at constant temperature and
that, in contrast to the thermal conductivity, (∂D/∂p)_T is much
smaller for CH4 than for Ar, Kr, and Xe. The calculations displayed
in Table III show that none of the theoretical descriptions of the
friction coefficient is completely adequate. The source of error is
easily traced to an inadequate analysis of the autocorrelation
function. Eq. (12) corresponds to a Gaussian autocorrelation
function, which cannot reproduce cage effects in the liquid. The
linear trajectory approximation, either without correlation between
soft and hard forces (Eq. (13)) or with such correlation (Eq.
(14)),23 is better in predicting the magnitude of D, but the
predicted temperature dependence of D is not completely satisfactory,
although it is not badly in error. There is also one simple model,
the dense square-well fluid,24 which provides a useful zeroth order
approximation to the behavior of real fluids. Davis, Rice and Sengers
used this model to study the friction coefficient.24 Although the
magnitude of the predicted diffusion coefficient is too small by ~30%
when one assumes there is an exponential decay of the velocity
autocorrelation function, the predicted temperature dependence of D
is excellent (see Table III).


TABLE III

Self-Diffusion Coefficient for Liquid Argon
(Units of D are 10^-5 cm² sec^-1)

               84°K       90°K       100°K

D(obs)*        1.84       2.35       3.45
D(calc)a       1.43       1.80       2.25
D(calc)b       8.91       4.11       -
D(calc)c       2.25       2.49       -
D(calc)d       2.80       3.23       3.85
D(calc)e       2.46       2.70       3.34

* Naghizadeh and Rice.
a. Square well, exponentially decaying correlation function.
b. Small Step Diffusion Theory (Eq. (12)).
c. Small Step, isotope separation data (Eq. (12)).
d. Linear trajectory theory with no cross correlations (Eq. (13)).
e. Linear trajectory theory with inclusion of cross correlations (Eq. (14)).

Buttressing the reality of this agreement, Davis and Luks25 have
recently used the square-well model with great success for extensive
computations of all the transport properties of liquid Ar, Kr and Xe.
Although the square-well potential is certainly unrealistic, it does
have the major features of a realistic pair potential. Concerning the
pair potential and radial distribution function, it is interesting to
note that the poor agreement between experimental data and the Rice
and Kirkwood small step-diffusion theory26 can be considerably
improved when the average Laplacian of the intermolecular potential
is evaluated from isotope separation data, as suggested by Friedman
and Steel27 and by Boato, Casanova and Levi28 (see Table III).
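The statements about magnitude and temperature dependence can be
separated numerically using the Table III entries. The sketch below
compares, for each row that has values at both 84°K and 100°K, the
ratio D(100°K)/D(84°K) with the observed ratio. The numbers are the
table entries as printed; the dictionary keys and function names are
hypothetical labels introduced for this example.

```python
# Table III entries in units of 1e-5 cm^2/sec, keyed by temperature in K.
TABLE_III = {
    "observed": {84: 1.84, 90: 2.35, 100: 3.45},
    "square_well": {84: 1.43, 90: 1.80, 100: 2.25},
    "linear_trajectory": {84: 2.80, 90: 3.23, 100: 3.85},
    "linear_trajectory_cross": {84: 2.46, 90: 2.70, 100: 3.34},
}


def temperature_ratio(row, t_low=84, t_high=100):
    """Ratio D(t_high)/D(t_low), a magnitude-free measure of the T dependence."""
    return row[t_high] / row[t_low]


def closest_to_observed(table):
    """Name of the theoretical row whose ratio is nearest the observed one."""
    target = temperature_ratio(table["observed"])
    theories = [name for name in table if name != "observed"]
    return min(theories,
               key=lambda name: abs(temperature_ratio(table[name]) - target))


best = closest_to_observed(TABLE_III)
```

With these entries the square-well row tracks the observed ratio most
closely even though its absolute values are the lowest, which is the
distinction between magnitude and temperature dependence drawn in the
text.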

We now turn briefly to the study of ion mobility in simple liquids.
To date, the literature on this subject is almost entirely limited to
the experimental and theoretical study by Davis, Rice and Meyer18 on
the mobilities of positive and negative ions in liquid Ar, Kr, and
Xe. We therefore refer the interested reader to the details presented
in the original papers, and also to the monograph by Rice and Gray.4
For our purposes it is sufficient to mention that the experimental
data indicate that the mobility varies linearly, but very smoothly,
with the external pressure, while the logarithmic dependence of the
product μT can be represented adequately by the Einstein relation.
The magnitude, the pressure dependence and the temperature dependence
of the positive ion mobility in liquid Ar, Kr, and Xe can be
quantitatively accounted for by the Rice-Allnatt theory, and the
agreement with experiment is very satisfactory if the positive ions
are Ar2+, Kr2+ or Xe2+, while it is much poorer if a different ionic
species (say, Ar+) is postulated. On the other hand, the study of
negative ions is much more difficult because of impurity effects.
Indeed, Davis, Rice and Meyer interpret their mobility data in terms
of the properties of the O2− ion, and, if it may be assumed that the
negative charge carriers in liquid Ar, Kr and Xe are effectively O2−
ions, the Rice-Allnatt theory is again seen to provide an adequate
representation of the observations.

Finally, consider the coefficient of bulk viscosity. φ can be
determined from the excess ultrasonic attenuation (excess over that
due to shear viscosity and thermal conductivity) and the only
available data are for liquid Argon and liquid Nitrogen. These
measurements demonstrate that: (a) in the high temperature region,
the Rice-Allnatt theory gives a quantitative description of the
density dependence and magnitude of the bulk viscosity, (b) at high
density and low temperature the absolute magnitude of φ is predicted
to within ~50%, (c) the predicted ratio (φ/η) = 1.3 is within ~20% of
the observed value of (φ/η). The experimental data also show that
(φ/η) decreases as ρ increases. No calculation of the dependence of
φ/η on density has yet been made. At the lowest temperature and
highest density studied, the theory is clearly in disagreement with
experiment. Gray and Rice assign the error to inadequacy of the
available radial distribution function. In view of the successes of
the theory under conditions where it is reasonable to believe that
the distribution functions are not too bad, this conclusion seems
valid. Indeed, a general examination of the agreement between theory
and experiment convinces us that in all cases thus far examined the
major contribution to the observed disagreement arises from the
inadequacy of the available radial distribution function. It is clear
that complete and definitive testing of the Rice-Allnatt theory
awaits the determination of very accurate equilibrium distribution
functions and potential functions. The presently available agreement
between theory and observation suggests (but does not prove) that the
Rice-Allnatt theory is a good first order theory of transport in
liquids.

4.  DISCUSSION 

Since the space available is limited there are only a few more
remarks which I can make. First, I wish to mention that the theory of
the autocorrelation function, which is the most important part of the
theory of the friction constant, has been advanced in recent studies.
Using a memory function formalism, Berne, Boon and Rice29 have shown
how the autocorrelation function and power spectrum of the linear
momentum, including the effects of recoil, can be simply interpreted.
This same formalism can be used to generate a linear trajectory
analysis of the transport coefficients,30 and a description of the
relationship between initial correlations, dynamical memory, and the
generalized Prigogine collision operator.31

Second, since I have several times alluded to the Prigogine theory,
and used it as a bench mark against which approximate analyses were
measured, it seems appropriate to give the following very brief
description. The original analysis of Prigogine and co-workers used a
Fourier decomposition of the N-body distribution function, f^(N), and
the classification of terms which appear in the decomposition
according to powers of t, N/V, and λ, where λ is the coupling
constant of the intermolecular potential energy. In this formalism
the Liouville equation takes a form which describes the transitions
between different distributions of wave vectors; the transitions are
generated by the interactions between the molecules, and the
distributions of wave vectors are the respective Fourier space
representations of the distribution function. The equations are
naturally ordered in a sequence which counts the number of non-zero
wave vectors. To evaluate the terms in this representation, Prigogine
and Balescu have invented a diagrammatic notation.5 Later, by the use
of operator techniques to solve the Liouville equation, Prigogine and
Resibois have shown that5

\[ \frac{\partial \rho_0(t)}{\partial t}
   = \mathcal{D}_0\bigl(t;\, \rho_{\{k\}}(0)\bigr)
   + \int_0^t G_{00}(t - t')\, \rho_0(t')\, dt' \qquad (15) \]

In (15) ρ_{k} is the {k} Fourier component of the distribution
function. The exact master equation (15) describes the time evolution
of ρ_0 (which is the velocity distribution function) for all time.
The structure of the equation is very simple: the inhomogeneous term
𝒟_0 gives the contribution, at time t, of the initially excited
Fourier components which through interaction decay towards a state
with no correlations; the second term has a nonlocal structure so
that ∂ρ_0/∂t depends upon ρ_0(t') for times t' < t, and is
representative of the fact that ρ_0 in general changes during a
collision. Note that all the effects of the initial correlations and
initial conditions appear in the term 𝒟_0. A kinetic equation for the
nondiagonal Fourier components, ρ_{k}(t), may also be obtained. For
this and other applications the reader is referred to the monograph
by Prigogine and the original literature.

The first use to which the general master equation can be put is the
examination of kinetic equations in the limit t → ∞. In the
transition to the limit it is seen that not only do all the effects
contained in 𝒟_0 wash out, but also that all effects arising from the
finite duration of the binary encounters still do not prevent the
kinetic equation being Markovian in the limit t → ∞. For example, the
phenomenological transport coefficients involve only the asymptotic
cross sections, and no terms appear which are related to the duration
of an encounter, except in the case of the bulk viscosity.

The appearance of the time convolution in the generalized master
equation specifically includes contributions to ∂ρ_0(t)/∂t from
ρ_0(t') for t' ≤ t, with a weight G_00(t − t'). The behavior of
G_00(t − t') is determined by the intermolecular forces, density,
etc., but not by the initial state of the system. Thus, the role of
the convolution as such will only be important for times of the order
of the interaction time t_c, giving rise to transient effects. For
t ≫ t_c, the velocity distribution will vary only a little during
t_c, and the operator

\[ \int_0^t G_{00}(t - t')\, dt' \]

will be approximately independent of t. The kinetic equation will
then have the Markovian form

\[ \frac{\partial \rho_0}{\partial t} = \mathcal{G}_{00}\, \rho_0, \qquad (31) \]

where the operator 𝒢_00 is given by

\[ \mathcal{G}_{00} = \int_0^\infty G_{00}(t - t')\, dt'. \qquad (32) \]

At this stage, correlations over distances of the order (p/m) t_c
will have been destroyed and the system will be evolving in the
kinetic regime.
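The passage from the convolution form to the Markovian form can be
illustrated with a scalar toy model. The sketch below integrates
dρ/dt = ∫_0^t G(t − t') ρ(t') dt' for a single number ρ and compares
the result with the Markovian solution obtained by replacing the
convolution with (∫_0^∞ G ds) ρ. The exponential kernel
G(s) = −(γ/t_c) exp(−s/t_c), for which the integrated operator is
simply −γ, is an assumption chosen for convenience and is not taken
from the text; the function names are likewise hypothetical.

```python
import math


def solve_memory_equation(gamma, t_c, t_end=1.0, n_steps=1000):
    """Integrate d(rho)/dt = int_0^t G(t - t') rho(t') dt', rho(0) = 1.

    G(s) = -(gamma / t_c) * exp(-s / t_c) is an assumed model kernel.
    The convolution uses the trapezoidal rule; time stepping is explicit
    Euler, so n_steps must resolve t_c (dt << t_c).
    """
    dt = t_end / n_steps
    kernel = [-(gamma / t_c) * math.exp(-k * dt / t_c)
              for k in range(n_steps + 1)]
    rho = [1.0]
    for n in range(n_steps):
        # Trapezoidal estimate of the memory integral at time t_n.
        conv = sum(kernel[n - j] * rho[j] for j in range(n + 1))
        conv -= 0.5 * (kernel[n] * rho[0] + kernel[0] * rho[n])
        conv *= dt
        rho.append(rho[n] + dt * conv)
    return rho[-1]


def markov_solution(gamma, t):
    """Solution of d(rho)/dt = -gamma * rho, the long-time limit of the model."""
    return math.exp(-gamma * t)


short_memory = solve_memory_equation(gamma=1.0, t_c=0.02)
long_memory = solve_memory_equation(gamma=1.0, t_c=1.0)
```

When t_c is much shorter than the decay time 1/γ, `short_memory` lies
close to `markov_solution(1.0, 1.0)`; when t_c is comparable to 1/γ
the two differ substantially. This is the scalar analogue of the
statement that the convolution matters only on the scale of the
interaction time.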

This description bears a considerable resemblance to the role which
was assigned to the time coarse graining in the Kirkwood analysis,
where, in order to develop an explicit representation of 𝒢_00, a
mechanism for the interactions was proposed. Equally important is the
difference between the coarse graining proposed by Kirkwood and the
way in which 𝒢_00 is reached. The simple form of coarse graining
involves an unweighted time average, while 𝒢_00 is the result of a
complex weighting determined by the nature of the interaction.

The method used to obtain a master equation for the velocity
distribution function may also be used, with slight extension, to
describe the time evolution of the molecular correlations. Again, a
non-local equation is found to hold for all t, reducing to a
Markovian equation in the limit t → ∞.

In  brief,  then: 

(a)  The  general  kinetic  equation  is  non-Markovian. 

(b) For times long compared to the duration of an encounter (and to
other characteristic times in more general cases) it reduces to a
Markovian equation, which may or may not require correction for
effects arising from the finite duration of an encounter.

(c)  For  quasi-stationary  situations,  only  the  asymptotic  form  of 
the  diagonal  fragment  enters  the  collision  operator. 

Thus,  the  form  of  the  kinetic  equation  depends  on  the  type  of 
process  being  described  and  on  the  time  scale  of  interest. 

It is interesting to compare the Kirkwood coarse-graining hypothesis
with the neglect of the initial correlations described by 𝒟_0 and the
transition to Markovian behavior. First, it should be noted that 𝒟_0
tends to zero as t increases for the reason that correlations of
finite extent in the initial state can only interact for a finite
time. Indeed, if the range of the correlations is of molecular
dimensions, then the lifetime of the initial correlations is of the
order of an interaction time.* Second, the effect of the
non-Markovian kernel is to connect the distribution function to
itself over times of the order of the duration of an encounter. Now
the fundamental idea involved in the use of coarse graining is that
the dynamical event in τ is independent of prior dynamical events.
This means that on the time scale chosen, 𝒟_0 must vanish and that
the time integral involving G_00(t) must approach a limit independent
of τ. In Section 2 we remarked that if the distribution function
returned to the form characteristic of the local environment on a
time scale short compared to τ, then the process defined by time
smoothing became a Markov process. Moreover, for the case of a
perturbation in momentum space, Rice and Allnatt have shown that the
singlet kinetic equation is consistent with this condition. It is
clear that, in effect, the calculation of the relaxation time for a
perturbation in momentum space is equivalent to the calculation of
the lifetime of the correlations in a specified initial state. The
consistency in this regard shows that 𝒟_0 can be neglected under the
conditions described by the Rice-Allnatt equation and that the use of
time coarse graining does lead, as expected, to 𝒟_0 → 0. Of course
this is shown only for a special case, but the physical description
is clear enough that the argument can be extended. For some states
𝒟_0 cannot be neglected (spin-echo experiment) and each situation
must be separately analyzed. It is safe to conclude, however, for the
liquids we have considered, that coarse graining does lead to an
equation from which all information about the initial correlations
has been removed.

* The lifetime of the correlations is, in fact, of the order of a
relaxation time. Nevertheless, the destruction term vanishes in an
interaction time.

Now, both the Hurt-Rice and Popielawski-Rice analyses show that the
Rice-Allnatt equation is the first term in a well defined
approximation scheme in which:

a) The dynamical effects of a strong, short ranged repulsion between
the molecules are approximated by successive uncorrelated
quasi-binary encounters,

b) The dynamical effects of a weak, longer ranged attraction are
included to all orders of perturbation theory,

c) The effects of the weak interaction on the dynamics of the
quasi-binary encounter are neglected.

If the Rice-Allnatt equation is compared with the Prigogine theory,
what is included and what is left out? The derivation of the
Rice-Allnatt equation shows that it includes contributions from:

a) all terms leading to the weak master equation,

b) all terms in the binary collision expansion corresponding to
uncorrelated successive binary collisions.

Terms not mentioned under (a) or (b) are not totally neglected. The
use of the local equilibrium approximation (and the explicit refusal
to expand f^(2) in a power series in ρ) is equivalent to the
assumption that all remaining terms in the expansion are replaced by
the corresponding equilibrium terms. Thus, the dynamics of the pair
of molecules is influenced by the stationary field of the surrounding
molecules, but the reaction of the surrounding molecules to the
motion of the pair of molecules, and the instantaneous effect of the
motion of the surroundings on the pair of molecules, are neglected.
The local equilibrium approximation can be thought of as replacing
the N body dynamical problem by a two body problem with boundary
conditions specified in terms of the equilibrium distribution of the
other molecules of the system.

We  conclude  that  there  is  both  theoretical  and  experimental 
basis  for  the  belief  that  the  Rice-Allnatt  kinetic  equation  is  a 
reasonable  zeroth  order  description  of  a  simple  liquid. 
B. Studies of the Electronic States of Simple Liquids
1.  INTRODUCTION 

Considerable effort has been devoted to studies of the electronic
states of free molecules and of crystals. Consider first the spectrum
of bound states. Studies of dilute gases, from which information
about the free molecule is deduced, are simplified by the absence of
intermolecular interactions and hence of correlations between the
positions of the molecules. Thus, any one molecule may be regarded as
isolated except for occasional binary collisions. Binary collisions,
which decrease in frequency as the gas density is decreased, lead to
a broadening and shift of the stationary states of the gas, and from
this alteration of the spectrum there can be deduced information
about the intermolecular potential.1 A different simplification of
description is possible in the case of crystalline solids. The long
range order of a crystalline lattice is a consequence of strict
translational symmetry. In turn, the existence of strict
translational symmetry, as well as the presence of other geometric
symmetries within the unit cell, permits the use of a description in
which independent collective coordinates are fundamental to the
representation. These collective coordinates, of course, correspond
to the exciton states of the crystal. Interaction between the exciton
states and lattice vibrations leads to some alteration of the
spectrum, and useful information may thereby be obtained.2

Similar considerations may be used to describe the conduction
electron  states.  In  a  gas,  simple  kinetic  theory,  assuming  binary  col¬ 
lisions,  is  sufficiently  accurate.  In  a  crystal,  an  electron  may  be  de¬ 
scribed  as  interacting  with  collective  vibrational  modes,  the  pho¬ 
nons.  In  both  cases,  the  scattering  leading  to  finite  electron  mobility 
is  readily  interpreted  in  terms  of  a  simple  interaction  potential  and 
geometric  configuration. 

Now,  a  liquid  has  short  range  structural  order  but  no  long  range 
structural  order.  Because  of  the  strong  interactions  between  the 
molecules  in  a  liquid,  approximations  suitable  to  the  description  of 
a  gas  are  not  useful.  Furthermore,  the  lack  of  a  simple  geometric 
symmetry  in  the  short  range  order  of  the  liquid  makes  it  necessary 
that  the  concepts  of  the  exciton  theory  of  crystals  and  the  band 
theory  of  conduction  be  extensively  modified  before  they  can  be 
applied  to  the  description  of  a  liquid.  It  is  my  opinion  that  the 
development  of  a  physically  realistic  and  incisive  interpretation  of 
the  electronic  states  of  liquids  and  of  other  disordered  systems  is 
one of the most interesting and most challenging of contemporary
scientific  problems.  One  outgrowth  of  this  contention  is  a  program, 
initiated  a  few  years  ago  at  the  University  of  Chicago,  to  study  the 
electronic  states  of  simple  liquids.  This  report  presents  a  very  short 
resume of some of our work on the electronic properties of simple
monoatomic dielectric liquids, together with some comments on
what we now understand, do not understand, and where new concepts
and constructs are needed.

2.  EXCESS  ELECTRON  STATES  IN  MONOATOMIC 
DIELECTRIC  LIQUIDS 

a. Free Electron States

We consider, first, the properties of an excess electron in a liquid
composed of neutral, closed shell atoms (for example, Helium, Argon,
Krypton). Because the interaction between an electron and
a neutral atom is both weaker and of shorter range than the
interaction between an electron and an ion, it is tempting to suppose
that an excess electron in a simple liquid behaves very much like
a free electron. Of course, the fact that the electron-atom interaction
is non-zero suggests that scattering phenomena cannot be totally
neglected, and that the free electron spectrum of states will be
perturbed by the presence of fluid atoms.

Given presently available technology and the very low level at
which excess electrons may be introduced into a simple dielectric
liquid, most of the available experimental data on the properties
of excess electrons are obtained from mobility measurements.
Referring the reader elsewhere for detailed descriptions of the
experimental techniques,3 we display in Figs. 1, 2, and 3 the results of
mobility measurements made in a time of flight instrument at very
low electron concentrations ($10^3$ cm$^{-3}$ ~ $10^{-18}$ molar). These data
clearly show how the mobility of an excess electron in liquid Ar
is decreased by increasing temperature (for T < 115°K) and
increased by increasing pressure. Also, for T < 115°K, the drift
velocity is proportional to the electric field strength.
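
The concentration quoted above can be checked with a short unit conversion. The sketch below assumes the number density is $10^3$ electrons per cm$^3$; the function name is illustrative and not taken from the original work.

```python
AVOGADRO = 6.02214076e23  # particles per mole


def number_density_to_molar(n_per_cm3: float) -> float:
    """Convert a number density (particles per cm^3) to mol per litre."""
    per_litre = n_per_cm3 * 1000.0  # 1 litre = 1000 cm^3
    return per_litre / AVOGADRO


if __name__ == "__main__":
    # Assumed electron density of 1e3 per cm^3, as quoted in the text.
    print(f"{number_density_to_molar(1.0e3):.2e} mol/L")
```

The result is of the order of $10^{-18}$ mol per litre, which is why mobility is nearly the only property accessible at such dilution.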

Figure 3. Pressure dependence of the mobility of electrons in liquid Ar
(abscissa: pressure p in atmospheres).


Given  the  information  cited,  what  can  be  said  about  the  excess 
electron  states  of  a  simple  liquid?  We  consider  the  development  of 
a  microscopic  theory  to  explain  these  data.  Clearly,  there  are  (at 
least) two major problems to be solved:

(a)  What  is  the  effective  atomic  potential  scattering  the  electron? 

(b)  Given  the  scattering  potential,  how  do  we  determine  the  elec¬ 
tron  mobility  and  other  transport  properties? 

An electron a distance R from an isolated atom of polarizability
$\alpha$ induces on it a dipole of strength $\alpha e/R^2$ which, in turn, attracts
the electron with a force of magnitude $2\alpha e^2/R^5$. Because of other
electron-atom forces, the interaction of an atom and electron does
not increase indefinitely as $R \to 0$. A convenient form for the
electron-atom potential is $u_a = -\alpha e^2/2(R^2 + R_a^2)^2$, where $R_a$ is a
measure of the strength of short ranged correlation and exchange
forces. The value of $R_a$ may be determined from the electron-atom
scattering cross section in the limit of zero incident energy of the
electron (scattering length). Lekner4 has shown that when $R_a$ is
fixed in this fashion, the electron-atom momentum transfer cross
section in gaseous Ar is very accurately reproduced up to electron
energies of several volts.
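
The relation between the model potential and the polarization force can be checked numerically. The sketch below works in reduced units with e = 1; the function names and the parameter values used in any call are illustrative assumptions, not fitted values for Ar.

```python
def u_atom(R: float, alpha: float, Ra: float, e: float = 1.0) -> float:
    """Model electron-atom potential u_a(R) = -alpha e^2 / (2 (R^2 + Ra^2)^2)."""
    return -alpha * e**2 / (2.0 * (R**2 + Ra**2) ** 2)


def attractive_force(R: float, alpha: float, Ra: float, e: float = 1.0) -> float:
    """Magnitude of the attraction, du_a/dR = 2 alpha e^2 R / (R^2 + Ra^2)^3."""
    return 2.0 * alpha * e**2 * R / (R**2 + Ra**2) ** 3


if __name__ == "__main__":
    alpha, Ra = 1.0, 1.0  # hypothetical reduced-unit parameters
    for R in (0.0, 1.0, 10.0, 100.0):
        print(R, u_atom(R, alpha, Ra), attractive_force(R, alpha, Ra))
```

For R much larger than $R_a$ the force tends to $2\alpha e^2/R^5$, while at R = 0 the potential stays finite at $-\alpha e^2/2R_a^4$, which is the purpose of the cutoff.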

What  is  the  polarization  interaction  between  an  electron  and  a 
particular  atom  in  a  liquid?  To  answer  this  question  we  must  find 
the  local  field  acting  on  the  atom,  which  consists  of  the  direct  field 
and  the  sum  of  all  other  fields  arising  from  dipoles  induced  on 
neighboring  atoms.  The  problem  is  simplified  by  the  fact  that 
$v_A \ll v_e \ll v_{Ae}$, where $v_A$, $v_e$ and $v_{Ae}$ are typical velocities of
atoms in the liquid, of excess electrons, and of the bound atomic
electrons.  Because  of  the  large  differences  between  these  velocities, 
the  motion  of  the  atoms  can  be  ignored  in  calculating  the  mutual 
screening  effect  of  neighbors,  and  the  motion  of  the  excess  elec¬ 
trons  may  be  ignored  in  calculating  the  induced  polarizations  of 
the  atomic  electrons. 

Consider, now, a point charge, -e, in a liquid composed of atoms
of polarizability $\alpha$. In the absence of other nearby atoms the electric
field acting on an atom at R would be $e/R^2$ (see Fig. 1). We define
the average local field acting on the atom at R, and along R, by
$(e/R^2) f(R)$. But this local field is equal to the direct field $(e/R^2)$,
plus  the  contribution  to  the  field  arising  from  all  the  other  induced 
dipoles  in  the  liquid.  Given  that  an  atom  is  located  at  some  point 
in  the  liquid,  the  probability  of  finding  another  atom  in  the  volume 
element dt at distance s from the first is defined to be $\rho\,g(s)\,dt$, where
$\rho$ is the number density of atoms and g(s) is the pair correlation
function (radial distribution function). The field acting on this
second atom is $(e/t^2) f(t)$, so that it carries an induced dipole of
average strength $\alpha (e/t^2) f(t)$. After calculation of the component along
R  of  the  field  at  R  arising  from  this  dipole,  and  integration  over 
all  possible  positions  of  the  second  atom,  it  is  found  that  f(t)  must 
satisfy  a  linear  integral  equation.4  The  solution  of  that  linear  in¬ 
tegral  equation  gives  the  required  self-consistent  ensemble-averaged 
local  field.  The  local  field  for  a  realistic  liquid  structure  is  shown  in 
Fig.  (5).4  The  variation  of  shielding  with  distance  is  particularly 
noteworthy. It is easy to show that the screening effects are
contained entirely in the local field and therefore that the
electron-atom polarization interaction in the liquid is
$u_\alpha = -\alpha e^2 f(R)/2(R^2 + R_a^2)^2$. Of course, the total electron-atomic potential is a sum
of  the  polarization  and  atomic  potentials.  For  a  liquid  like  Ar  the 
atomic  potential  is  little  influenced  by  the  state  of  aggregation,  and 
we  may  assume  that  Ra  is  the  same  in  the  gas  and  the  liquid. 

Because  of  the  overlapping  of  potential  fields  in  the  liquid,  the 
electron  is  never  in  field-free  space.  Now,  the  average  potential  near 
an atom at $R_i$ is the sum of the atomic field centered at $R_i$ and the
average over the positions of all other atoms. The effective potential
for the electron scattering is then defined by the difference between
the instantaneous potential and the average potential. As shown in
Fig. 6, $u_{\mathrm{eff}}$ is very much weaker, and very much more slowly varying,
than is the scattering potential in the gas phase. With the calculation
of $u_{\mathrm{eff}}$, we have answered question (a) to the accuracy required
for  our  present  purposes. 

Figure 5. (a) The shielding function f(R) in liquid Ar. The dashed line
refers to the interaction of a point charge with neutral atoms, the
solid line to the interaction of a charge with diameter equal to the
atomic diameter.

To answer question (b) we adopt the single scattering approximation,
that is, the scattered amplitude of the electron at any point is
the coherent sum of amplitudes scattered from individual atoms,
with neglect of the sum of amplitudes multiply scattered from
different atoms. The wave incident on each atom is a wave packet
which is approximated by a plane wave. The single scattering
approximation is valid when the mean free path, $\Lambda$, is large
compared with the de Broglie wavelength of the wave packet.
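
To give a feeling for the validity condition, the sketch below estimates the de Broglie wavelength of a thermal electron and compares it with a mean free path. The free electron mass, the thermal energy 3kT/2, the safety factor of 10, and the mean free path values passed in are all assumptions made for illustration; none is taken from the original work.

```python
import math

H = 6.62607015e-34       # Planck constant, J s
M_E = 9.1093837015e-31   # free electron mass, kg (assumed, no effective mass)
K_B = 1.380649e-23       # Boltzmann constant, J/K


def thermal_de_broglie_wavelength(T: float) -> float:
    """Wavelength h/p of an electron with kinetic energy 3 k T / 2, in metres."""
    p = math.sqrt(3.0 * M_E * K_B * T)
    return H / p


def single_scattering_valid(mean_free_path: float, T: float,
                            factor: float = 10.0) -> bool:
    """True when the mean free path exceeds `factor` wavelengths.

    `factor` is an arbitrary illustrative threshold for "large compared with".
    """
    return mean_free_path > factor * thermal_de_broglie_wavelength(T)


if __name__ == "__main__":
    T = 85.0  # K, a temperature in the liquid Ar range
    print(f"wavelength = {thermal_de_broglie_wavelength(T) * 1e10:.0f} Angstrom")
    print(single_scattering_valid(1.0e-6, T))  # hypothetical 1 micron path
```

At liquid Ar temperatures the wavelength comes out at roughly a hundred Angstroms, so the approximation requires mean free paths well above that scale.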

Figure 5. (b) The effective dielectric constant as a function of
charge-charge separation in liquid Ar.

As a result of scattering, a particle-wave moving through a liquid
transfers energy to all collective excitations of the system. In the
single scattering approximation all the necessary information about
the excitations is contained in the function that describes the
probability of finding a molecule at R' at time t', given that a
selected molecule was at R at time t. This function is called the
Van Hove space-time pair correlation function, and its Fourier
transform, denoted $S(k, \omega)$, is called the spectral function.5 The
probability of an electron scattering with loss of momentum $\hbar k$ and
energy $\hbar\omega$, say by creating a density fluctuation of momentum $\hbar k$
and energy $\hbar\omega$, is proportional to the product of the single
atom-electron differential scattering cross section and the spectral
function. Now, the spectral function has certain general properties
which are of use to us. First, the probability of scattering with
momentum transfer $\hbar k$, averaged over all possible energy transfers, is
proportional to the structure factor S(k), which is itself just the
Fourier transform of the static excess pair correlation function $G(R, 0)$
[$G(R, 0) \equiv g(R) - 1$]. This is, then, just the same result as
embodied in the familiar formula for the intensity of X-ray scattering
from a liquid.6 Second, the average energy transfer in an interaction
is exactly equal to the free atom recoil energy for the same
momentum transfer, independently of structure or thermal motion. Third,
it may be shown that the mean square energy transfer is only
approximately structure independent. The mean square energy transfer
is structure independent if we neglect, relative to the mean
energy transfer, terms which are smaller by the ratio of the energy
transferred in a collision to the thermal energy of an atom,
$\sim m\epsilon/Mk_BT$, and hence negligible for our case.
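
The first two properties listed above can be summarized as moment relations of the spectral function. The block below is a sketch in one common normalization, which is an assumption since the text does not fix one; $\omega$ is counted as energy lost by the electron and $M$ is the atomic mass.

```latex
\begin{align}
\int_{-\infty}^{\infty} S(k,\omega)\,d\omega &= S(k)
  = 1 + \rho \int \left[ g(R) - 1 \right] e^{i\mathbf{k}\cdot\mathbf{R}}\,d\mathbf{R}, \\
\int_{-\infty}^{\infty} \hbar\omega\, S(k,\omega)\,d\omega &= \frac{\hbar^{2} k^{2}}{2M}.
\end{align}
```

The first relation is the statement that averaging over energy transfers leaves only the static structure factor; the second expresses the fact that the mean energy transfer equals the free atom recoil energy, with no dependence on g(R).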

Figure 6. The effective potential acting on an electron in liquid Ar.

The conditions cited can now be used to derive a kinetic equation
descriptive of the electron-liquid system. The important idea is that
because the electron atom mass ratio, m/M, is so small, an electron
colliding elastically with an atom undergoes large deflections, but
suffers only very small changes in energy. It is found that the rate of
transfer of energy is determined by a mean free path

$$\Lambda_\epsilon^{-1} = 2\pi\rho \int_0^\pi \sin\theta\,(1 - \cos\theta)\,\sigma(\epsilon, \theta)\,d\theta \tag{1}$$

which is independent of structure, while the rate of transfer of
momentum is determined by a mean free path

$$\Lambda_p^{-1} = 2\pi\rho \int_0^\pi \sin\theta\,(1 - \cos\theta)\,\sigma(\epsilon, \theta)\,S(k_\theta)\,d\theta \tag{2}$$

which does depend on the liquid structure. For thermal electrons,
for which $k \to 0$, the efficiency of energy transfer is greater than
that of momentum transfer by the factor 1/S(0), which is very large
in a liquid. Thus, an electron in a liquid comes to thermal
equilibrium very fast, and hot electron effects require large applied
fields.
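
The factor 1/S(0) can be illustrated by evaluating the two mean free path integrals of Eqs. (1) and (2) numerically. The sketch below is written in dimensionless units and assumes an isotropic cross section and a structure factor frozen at a constant value S(0) = 0.05; both choices, and all names, are illustrative assumptions rather than data for liquid Ar.

```python
import math


def _simpson(f, a: float, b: float, n: int = 2000) -> float:
    """Composite Simpson rule on [a, b] with an even number n of intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0


def inverse_mean_free_path(rho: float, sigma, structure=None) -> float:
    """2 pi rho * integral over theta of sin(theta) (1 - cos(theta)) sigma(theta) [S(theta)].

    sigma(theta): differential cross section at fixed electron energy.
    structure(theta): optional structure factor at the momentum transfer
    belonging to angle theta; omit it for the energy-transfer path, Eq. (1).
    """
    def integrand(theta: float) -> float:
        w = math.sin(theta) * (1.0 - math.cos(theta)) * sigma(theta)
        return w if structure is None else w * structure(theta)

    return 2.0 * math.pi * rho * _simpson(integrand, 0.0, math.pi)


if __name__ == "__main__":
    isotropic = lambda theta: 1.0        # hypothetical cross section
    s_zero = lambda theta: 0.05          # hypothetical constant S(0)
    inv_energy = inverse_mean_free_path(1.0, isotropic)
    inv_momentum = inverse_mean_free_path(1.0, isotropic, structure=s_zero)
    print(inv_energy / inv_momentum)     # ratio of the two rates
```

With a constant structure factor the ratio of the two inverse mean free paths reduces to 1/S(0) exactly; a realistic calculation would evaluate S at the momentum transfer for each angle, which for elastic scattering at electron wave number k is 2k sin(θ/2).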

The  predicted  drift  velocity,  as  a  function  of  field  strength,  is 
displayed  in  Fig.  7.  Good  agreement  between  theory  and  experiment 
is  obtained.4 

A  different  test  of  the  theory  is  provided  by  measuring  the  work 
required  to  inject  an  electron  into  the  liquid.  The  difference  in  en¬ 
ergy  resulting  from  the  transfer  of  an  electron  from  the  vacuum  to  a 
liquid  is  just  the  mean  potential  acting  on  the  electron  in  the 
liquid.  By  measuring  the  work  function  for  emission  of  electrons 
into  vacuum  and  into  a  liquid,  Lekner,  Halpern,  Gomer,  and  Rice7 
find  (see  Fig.  8)  that  the  energy  in  the  liquid  is  —0.33  eV  relative  to 
the  vacuum.  The  theoretical  calculations  sketched  lead  to  the  pre¬ 
diction  that  this  energy  change  is  —0.46  eV. 

The  preceding  considerations  suggest  that  a  conduction  electron 
in  liquid  Ar  is  nearly  free,  and  that  the  scattering  can  be  described 
in  terms  of  the  combined  effects  of  the  collective  polarization  field 
and  the  superposed  atomic  fields,  at  least  in  the  first  approximation. 
As  we  shall  see,  a  necessary  condition  for  the  validity  of  this  deduc¬ 
tion  is  that  the  electron-atom  scattering  length  be  sufficiently  small 
that  geometric  reorganization  in  the  liquid  is  not  energetically  fa¬ 
vored. 

b.  Bound  Excess  Electron  States 

When the properties of an excess electron in liquid He are
studied,8 it is found that the electron mobility is very much less than
anticipated, i.e., of the order of $10^{-2}$ cm$^2$ sec$^{-1}$ volt$^{-1}$. Furthermore, a
study of the electron mobility as a function of density reveals that
there is a drastic change as the density is increased towards the
liquid density (see Fig. 9).9 Examination of the electron-atom
interaction reveals,10 in this case, a very strong repulsion (see Fig. 10),
which in turn suggests that the quasi-free electron configuration may
be of higher free energy than other configurations. What other
configuration might be more stable than the free electron state? An
obvious possibility is that the electron-atom repulsion is strong

Figure 7. Drift velocity of electrons in liquid Ar as a function of the
applied field strength. The solid curve is the prediction of the theory
described in the text.


Figure 8. Photo-injected electron current in liquid Ar as a function of the
illuminating photon energy, $\hbar\omega$ (eV). The surface from which the
electrons come is tta-Cs.

energy to create a void around the center of the electron wave packet.
The work required to create the bubble depends on the volume
of the bubble, the effective surface tension of the liquid (new
surface area is created when a bubble is formed) and the increase in
electron kinetic energy because of localization inside the bubble.
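
The balance of terms just described can be sketched with the crudest form of the bubble model: an electron in the ground state of a spherical cavity with impenetrable walls, plus a surface term and a pressure-volume term. This is an illustrative simplification, not the calculation of the references cited; the surface tension value used in the example is an assumed order of magnitude for liquid He, and the function names are hypothetical.

```python
import math

HBAR = 1.054571817e-34   # J s
M_E = 9.1093837015e-31   # free electron mass, kg


def bubble_energy(R: float, gamma: float, pressure: float = 0.0) -> float:
    """Energy (J) of a bubble of radius R (m).

    kinetic : ground state of a particle in a hard-wall sphere, pi^2 hbar^2 / (2 m R^2)
    surface : 4 pi R^2 gamma, with gamma the surface tension in N/m
    volume  : (4/3) pi R^3 p, with p the applied pressure in Pa
    """
    kinetic = (math.pi * HBAR) ** 2 / (2.0 * M_E * R**2)
    surface = 4.0 * math.pi * R**2 * gamma
    volume = (4.0 / 3.0) * math.pi * R**3 * pressure
    return kinetic + surface + volume


def equilibrium_radius_zero_pressure(gamma: float) -> float:
    """Radius minimizing bubble_energy at p = 0: R^4 = pi hbar^2 / (8 m gamma)."""
    return (math.pi * HBAR**2 / (8.0 * M_E * gamma)) ** 0.25


if __name__ == "__main__":
    gamma = 3.6e-4  # N/m, assumed illustrative surface tension of liquid He
    R0 = equilibrium_radius_zero_pressure(gamma)
    print(f"R0 = {R0 * 1e10:.1f} Angstrom, E = {bubble_energy(R0, gamma):.3e} J")
```

The kinetic term falls as $1/R^2$ while the surface term grows as $R^2$, so a minimum always exists in this sketch; whether the bubble is actually favored depends on comparing that minimum with the energy of the quasi-free state, which this sketch does not attempt.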


Of course, the electron-atom repulsion is reduced as a result of
enclosing the electron in a void, and whether or not bubble forma¬
tion  is  thermodynamically  favored  depends  on  the  balance  between 
all  the  factors  mentioned.  The  bubble  model  was  first  suggested 
by  Careri  and  Feynman11  and  has  since  been  studied  by  several 
other  investigators.  The  most  sophisticated  calculations  are  those 
of  Hiroike,  Kestner,  Rice,  and  Jortner,12  who  use  the  mathematical 
isomorphism  between  a  pair  product  form  of  the  wave  function  and 
the  pair  distribution  function  of  a  classical  liquid  in  an  external 
field  to  avoid  the  introduction  of  a  surface  tension,  etc.  All  calcula¬ 
tions  are  in  agreement  that: 

(a)  Void  formation  is  only  favored  at  high  density  (see  Fig.  11). 


Figure 11. The free energies of a free electron and a bound (bubble)
electron in He as a function of density. Note that the density at which the
curves cross is close to the density marked by an arrow in Fig. 9.

(b) The qualitative features of the trapped state are independent
of  whether  or  not  the  bubble  boundary  is  fuzzy  or  sharp  (see  Fig. 
12)  and  of  the  gross  magnitude  of  the  surface  energy  (see  Fig.  13). 


Figure 12. The effect of varying the bubble boundary thickness (fuzzy
boundary) on the stability of the bubble configuration in He.

Figure 13. The effect of surface tension on the stability of the bubble
configuration in liquid He.


In the calculations of Jortner, Rice, Kestner, Hiroike, and Cohen13
the electron-atom interaction was calculated from pseudopotential
theory. In quantitative terms, these calculations account for:

(c) The mobility of the electron and its pressure and temperature
dependence.13

(d) The energy required to inject an electron into liquid He
(theory 1.0 eV, observed 1.1 eV),14

(e) The density at which the transition from delocalized to
localized electron states occurs (see Fig. 9),

(f)  The  size  of  the  void  as  measured  by  electron-vortex  line  trap¬ 
ping  experiments.15 

Thus,  in  contrast  to  the  case  of  liquid  Ar,  in  the  case  of  liquid  He 
the  excess  electron  states  are  nothing  at  all  like  free  electron  plane 
wave  states.  We  see,  then,  that  the  excited  states  of  a  simple  liquid 
are  complex  and  depend  very  strongly  on  the  nature  of  the  electron- 
atom  interaction.  It  is  just  because  the  lack  of  rigid  lattice  permits 
easy  geometric  readjustment  of  the  local  structure  that  the  electronic 
states  are  so  strongly  coupled  to  the  translational  states  of  the  liquid. 
We may anticipate a rich variety of behavior of excess electron
(conduction electron) states, dependent on details of the electron-mole¬
cule  interaction  and  perhaps  on  the  nature  of  the  internal  states  of 
the  host  molecules.  We  are  only  at  the  very  threshold  of  understand¬ 
ing  what  these  states  are  like  or  how  they  behave  under  external 
perturbation. 

Thus  far  in  Sections  2a  and  2b  we  have  discussed  those  properties 
of the conduction electron states which are, at least qualitatively,
understood.  Examples  of  phenomena  that  are  not  understood  in¬ 
clude  the  following: 

(a) In liquid Ar above T = 115°K and in liquid Kr for all
temperatures thus far studied, the temperature dependence of the drift
velocity  is  opposite  to  that  expected  (see  Figs.  14  and  15).3  If  it  is  a 
valid  argument  that  the  polarization  field  in  the  liquid  sums  to  an 
almost  constant  field  which  does  not  influence  the  scattering,  one 
cannot  invoke  a  Ramsauer  effect  to  explain  these  data.  The  fact 
that  the  drift  velocity  depends  on  the  field  strength  more  than 
linearly  suggests  that  some  feature  of  the  scattering  process  is  not 
properly  accounted  for  in  the  analysis  thus  far  developed. 

(b)  There  is  observed  to  be  a  transition  from  localized  excess 
electron states to delocalized excess electron states in systems such
as  Na  in  molten  NaCl.16  The  nature  of  this  transition  is  intimately 
related  to  the  interplay  between  electron-electron  and  electron-ion 
(or  atom)  interactions.  Many  theoretical  studies  of  the  non-metal- 
metal  transition  have  been  made,17  and  there  is  relevant  experi¬ 
mental  data  from  studies  of  impurity  conduction  in  doped  semi¬ 
conductors.18  However,  these  studies  do  not  describe  the  role  of  the 
local  ion  or  atom  structure,  and  more  particularly  the  possibility  of 
tures of the doped solid and liquid are as great as those between a
pure  solid  and  liquid.  There  are  also  studies  of  the  properties  of 
liquid  metals,17  but  these  use  approximations  not  necessarily  valid 
in  dielectric  liquids. 

The  solution  of  these  and  other  problems  may  require  new  con¬ 
structs  and  lead  to  new  and  more  incisive  understanding  of  the  con¬ 
duction  electron  states  of  all  media.  The  problems  which  will  be 
raised when systems with internal states, e.g., polyatomic molecules,
polymers, etc., are studied can now only be foreseen with difficulty, but they
too  should  be  of  great  intrinsic  interest  and  value  in  the  construc¬ 
tion  of  a  complete  theory  of  the  electronic  properties  of  matter. 

3. EXCITON STATES IN A MONOATOMIC DIELECTRIC
LIQUID

We  now  examine  some  of  the  properties  of  the  bound  excited 
states  of  a  simple  liquid.  It  is  convenient  to  consider  several  interre¬ 
lated  questions: 

(a)  Do  exciton  states  exist  in  a  liquid?  If  such  states  do  exist, 
what  is  the  nature  of  the  spectrum? 

(b)  How  do  intermolecular  interactions  alter  the  spectrum  of  sta¬ 
tionary  states? 

(c)  How  is  energy  transferred  in  a  liquid? 

(d)  What  is  the  nature  of  the  relationship  between  the  bound 
electron  states  and  free  electron  states  in  the  liquid? 

Clearly,  questions  (a)  and  (b)  are  intimately  coupled  to  one 
another.  Nevertheless,  it  is  useful  to  proceed  by  first  considering  the 
properties  of  the  bound  states  in  the  absence  of  scattering,  and  then 
to  examine  how  the  spectrum  of  states  is  altered  by  scattering  pro¬ 
cesses. 

a.  Stationary  States  in  the  Absence  of  Scattering 

From  the  most  general  point  of  view,  it  may  be  argued  that  on  the 
scale  of  length  determined  by  the  wavelength  of  typical  electronic 
transitions,  both  liquids  and  solids  display  translational  symmetry. 
Indeed, it is only for distances of the order of 5-50 Å that
differences in the local geometries of liquids and solids are obvious. Thus,
provided  that  the  wavelength  of  an  incident  electromagnetic  wave 
is large relative to the range of molecular ordering, localized
excitations at two points in the liquid are related by the phase factor
exp (i k·R), where k is the excitation propagation vector and R the
vector  separation  of  the  two  points.  We  shall  see  later  that  even  in 
the  absence  of  scattering,  the  disorder  in  the  medium  leads  to 
damping,  but  this  can  be  shown  to  be  small  in  the  limit  that  |  k  | 
is  very  small  relative  to  the  reciprocal  of  the  near  neighbor  distance. 

To  describe  the  internal  structure  of  a  possible  exciton  state  we 
utilize  the  deductions  of  Section  2a;  that  is,  we  assume  that  an 
electron  is  nearly  free  in  Ar  and,  therefore,  that  the  wave  function 
of  a  conduction  electron  in  liquid  Ar  is  adequately  represented  by 
a  plane  wave.  It  is  also  convenient,  but  not  necessary,  to  assume  that 
the  hole  (ion  core)  is  stationary.  Although  the  hole  will  have  finite 
mobility, in general it moves much more slowly than does the
electron. Now the simplest approximation to the dynamics of the hole-
electron  pair  is  the  following.  A  wave  packet  describing  the  electron 
is  constructed  by  superposing,  with  appropriate  coefficients,  the 
free  electron  plane  wave  eigen-functions.  The  Hamiltonian  of  the 
electron  hole  pair  is  then  represented  as  a  sum  of  the  free  electron 
Hamiltonian  and  the  screened  coulomb  interaction.  It  is  then 
readily  shown  that18 

$$\frac{\hbar^2 k^2}{2m} A(k) + \sum_{k'} u_{kk'} A(k') = E\,A(k) \tag{3}$$

where A(k) is the coefficient of the plane wave |k> in the electron
wave packet, and $u_{kk'}$ is the Fourier transform of the screened
coulomb potential. If u were a simple coulomb potential, (3) would
be the momentum representation of the hydrogenic Schrödinger
equation for the amplitudes A(k). In this limit, the manifold of
levels is hydrogenic, and the amplitudes in the wave packet
expansion satisfy a hydrogenic equation. The results of Section 2a clearly
show that u is not a coulomb potential, and therefore the energy
level structure deviates from the hydrogenic structure. Using the
screened coulomb interaction appropriate to liquid Ar, obtained
by methods similar to those described earlier, the eigenstates of Eq.
(3) having s symmetry have been determined.20 Some calculated
charge densities are shown in Fig. 16 together with the corresponding
hydrogenic charge densities. Clearly, the shifts in charge
density are just those to be expected from the modified form of the
coulomb interaction.
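
The hydrogenic limit mentioned above can be written out explicitly. The block below is a sketch in Gaussian units with plane waves normalized in a volume V, both of which are assumptions about conventions that the text does not state; m is the electron mass and the hole is taken as fixed.

```latex
\begin{align}
u_{kk'} &= -\frac{4\pi e^{2}}{V\,|\mathbf{k}-\mathbf{k}'|^{2}}
  \qquad \text{(unscreened coulomb attraction)}, \\
E_{n} &= -\frac{m e^{4}}{2\hbar^{2} n^{2}}, \qquad n = 1, 2, 3, \ldots
\end{align}
```

Any distance dependence of the screening changes the $|\mathbf{k}-\mathbf{k}'|^{-2}$ form of the kernel, and the levels then no longer follow the $1/n^2$ sequence, which is the deviation from hydrogenic structure referred to in the text.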

The preceding simple calculations in no way elucidate the
dependence of the energy of the exciton on the propagation vector,
called the dispersion relation. To study the dispersion relation we
resort to a different set of considerations. Suppose some one
molecule in the liquid is excited, and that there is a nonvanishing dipole
matrix element between the excited state and the ground state of
the molecule. Then, because all the molecules in the liquid are
identical, and each of the excited states of the N molecules is N-fold
degenerate, excitation energy may be transferred from molecule to
molecule. This resonance transfer of energy may be described in
terms of the coupling between the transition dipole moments of
the molecules. Because of the long range of the dipole interaction,
dipolar motion is organized into collective dipolar polarization
waves. In turn, the polarization waves may be interpreted as the
classical equivalent of an exciton field. It may be shown, for
example, that the van der Waals energy of a medium of
nonoverlapping molecules arises from the shift in the spectrum of dipolar
fluctuations under the influence of their mutual interactions.21 In
making this last statement we have emphasized that the molecules
concerned  must  be  such  that  there  is  vanishingly  small  overlap  of 
the  electronic  wave  functions  in  both  the  ground  state  and  the  ex¬ 
cited  states.  If  there  is  overlap  between  the  molecules  when  one  is 
excited,  additional  interactions  not  included  (e.g.,  charge  transfer) 
must  be  of  some  importance  in  determining  the  cohesive  energy  and 
other  properties  of  the  system.  In  the  following  we  confine  atten¬ 
tion  to  the  case  that  overlap  of  electronic  wave  functions  is  vanish¬ 
ingly  small. 

To compute the exciton dispersion curve in a simple liquid
Nicolis and Rice22 make use of the analogy between the classical
polarization field and an exciton field. They consider an assembly of N
atoms (molecules) in a volume V, and represent each atom by a
Drude atom, i.e., each atom is assumed to have s bound electrons of
charge $e f_n^{1/2}$ (n = 1, 2, ..., s). Furthermore, each of the electrons
is assumed to undergo undamped harmonic oscillation about the
nucleus with frequencies $\omega_n$ (n = 1, 2, ..., s). The connection
between this classical model and the correct quantum mechanical
description is established by requiring $f_n$ to be the oscillator strength
corresponding to the transition of interest. To complete the
specification of the model system we must describe the nature of the
interatomic interactions. Nicolis and Rice assume that the total
potential energy of interaction consists of:

(a)  A  dipole-dipole  interaction  between  the  instantaneous  transi¬ 
tion  dipoles  on  each  atom, 

(b)  A  superposition  of  pair  interactions  corresponding  to  short 
range  repulsions  and  whatever  residual  high  order  multipolar  inter¬ 
actions  may  exist  in  the  system. 
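
The Drude atom introduced above has a simple closed-form response, which may help fix ideas before the interactions are switched on. The sketch below evaluates the standard undamped Drude-Lorentz polarizability of a single isolated atom in reduced units (e = m = 1); the oscillator strengths and frequencies in the example are hypothetical, and the function is not part of the Nicolis and Rice calculation itself.

```python
from typing import Sequence


def drude_polarizability(omega: float,
                         strengths: Sequence[float],
                         frequencies: Sequence[float]) -> float:
    """alpha(omega) = sum_n f_n / (omega_n^2 - omega^2), in units with e = m = 1.

    strengths   : oscillator strengths f_n (effective charge e * sqrt(f_n))
    frequencies : undamped resonance frequencies omega_n
    """
    if len(strengths) != len(frequencies):
        raise ValueError("need one frequency per oscillator strength")
    return sum(f / (w**2 - omega**2) for f, w in zip(strengths, frequencies))


if __name__ == "__main__":
    f_n = [0.6, 0.4]   # hypothetical oscillator strengths
    w_n = [1.0, 1.5]   # hypothetical resonance frequencies
    print(drude_polarizability(0.0, f_n, w_n))   # static polarizability
    print(drude_polarizability(0.5, f_n, w_n))   # below the first resonance
```

The static limit gives the polarizability that enters the local field problem discussed earlier, and the poles at the $\omega_n$ are the unperturbed modes that the dipole-dipole coupling subsequently shifts.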

The system just described has two features of interest. First, the
equations of motion for the amplitudes of individual atomic
dipoles, including the coupling terms, are linear when represented in
the complete phase space of the system. Second, the range of the
transition dipole-transition dipole interaction is very large relative
to the range of the other interactions between atoms, and this
interaction is also weak relative to the strength of the other
interactions. The first observation suggests that the dipole amplitudes
oscillate  harmonically.  The  second  observation  enables  Nicolis  and 
Rice  to  calculate  the  dispersion  relation  for  the  system  in  terms  of 
an  expansion  in  which  the  parameter  is  the  ratio  of  the  ranges  of 
the  short  and  long  range  interactions.  This  parameter  is  very  small 
in  the  case  considered.  It  is  found  that  the  spectrum  has  transverse 
and  longitudinal  modes  and  that  the  dispersion  relation  has  the 
following  interesting  properties: 

(a) The two transverse branches of the dispersion relation differ
only through the difference between the unperturbed modes.

(b) When the wave vector k is small, the change in spectrum
introduced by the dipole-dipole coupling is parabolic in k.

(c) There is a gap in the spectrum of transverse polarization
waves at k = 0, with the magnitude of the gap dependent on the
magnitude of the dipole-dipole coupling.

(d) In the limit $|k| \gg \sigma^{-1}$ ($\sigma$ is a molecular
diameter), there is a change in sign of the frequency shift arising
from dipole-dipole coupling and the frequency shifts of the
longitudinal and transverse modes are of opposite sign. The change in
sign results from a reversal of the orientation of the dipoles which
are near neighbors to any selected dipole when $|k|$ passes into the
indicated range from larger to smaller values of $|k|$.

(e) An examination of the imaginary part of the dielectric function
shows that the longitudinal polarization waves are damped, but that
in the limit $k \to 0$ the damping is small. The damping arises from
phase mixing which leads to instability of the collective
polarization wave relative to the independent dipole oscillations,
and is the analogue of Landau damping in a plasma.23 Thus, in the
absence of real scattering, we are justified in considering the
polarization waves to be quasi-normal modes when $k \to 0$, but
because the Landau damping increases greatly as $|k| \to \sigma^{-1}$
this interpretation would not be valid as $|k| \to \sigma^{-1}$. Since
optical excitation in the visible and ultraviolet regions corresponds
to $|k|\sigma \ll 1$, we conclude that in the absence of real
scattering events in the liquid, excitons can exist despite the
disorder in the liquid phase.
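
The condition that the product of wave vector and molecular diameter be much less than unity can be checked with simple arithmetic. The sketch below assumes an illustrative vacuum-ultraviolet wavelength of 150 nm and a molecular diameter of 0.4 nm; neither value is taken from the text, and the refractive index of the liquid is ignored.

```python
import math

def k_times_sigma(wavelength_m, sigma_m):
    """Dimensionless product |k| * sigma for light of the given wavelength."""
    k = 2.0 * math.pi / wavelength_m  # magnitude of the optical wave vector
    return k * sigma_m

# Assumed, order-of-magnitude inputs (not from the source):
wavelength = 150e-9  # m, vacuum ultraviolet
sigma = 0.4e-9       # m, molecular diameter

print(f"|k| sigma = {k_times_sigma(wavelength, sigma):.3f}")
```

For inputs of this size the product is of order 10^-2, so optical excitation probes only the long wavelength end of the dispersion relation.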

The limitations and approximations of the theory of Nicolis and
Rice resulting from the simple model used are, with one exception,
of secondary importance. The one exception is the neglect of all
scattering events. Although it has been shown that in the long
wavelength limit there exists a damping of the polarization waves
arising from phase mixing, it is obvious that there must exist other,
more efficient damping mechanisms. For the case of electronic
excitation these will arise from the scattering of the excited
electron by the fields of the surrounding atoms, and are analogous to
electron-phonon coupling in the crystalline solid. We, therefore,
expect there to be a shift and broadening of the exciton spectrum
calculated for the case of no scattering. Before examining the
limited experimental data we consider this problem.

b. THE SHIFT AND BROADENING OF THE EXCITATION SPECTRUM

The simplest description of the effects of scattering is in terms
of the hydrogenic model discussed at the beginning of Section 3a.
The argument leading to the conclusion that bound states exist is
of considerable generality, relying only on the existence of plane
wave states for the free electron. The bound states of the
core-electron pair will be reasonably well defined if the mean free
path of the electron is significantly larger than the orbital
circumference. Scattering of the orbital electron by the atoms of the
liquid causes a decrease in the lifetime of any given state and, in
the limit that the scattering is so frequent that an orbit cannot be
closed, no bound state can exist. This limiting case is, of course,
inconsistent with the assumption that the free electron is in a plane
wave state.
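
The criterion that the mean free path exceed the orbital circumference can be sketched for Bohr-like circular orbits. The code below assumes orbit radii equal to n squared times an effective Bohr radius; the function names, the safety factor `margin`, and the numerical inputs are hypothetical and serve only to illustrate the comparison.

```python
import math

def orbit_circumference(n, eff_bohr_radius):
    """Circumference of the n-th circular Bohr orbit, radius n**2 * a*."""
    return 2.0 * math.pi * n**2 * eff_bohr_radius

def well_defined_levels(mean_free_path, eff_bohr_radius, margin=1.0, n_max=20):
    """Principal quantum numbers whose orbit closes before a collision.

    A level counts as well defined when
    mean_free_path > margin * circumference (margin is an assumed factor).
    """
    return [n for n in range(1, n_max + 1)
            if mean_free_path > margin * orbit_circumference(n, eff_bohr_radius)]

# Hypothetical inputs, both lengths in nanometres:
print(well_defined_levels(mean_free_path=10.0, eff_bohr_radius=0.5))
```

Because the circumference grows as the square of n in this picture, the higher members of the series are the first to lose definition as the mean free path shortens.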

Let us assume that the scattering of the electron is sufficiently weak
that bound states do exist. We propose to compute the line width
by setting1*

$$\Delta = \frac{\hbar}{\tau} \qquad (4)$$

where the relaxation time, $\tau$, includes the effects of coherence
between the scattering amplitudes from different centers. Rice and
Jortner1** assume (4) to be valid, and use for $\tau$ the relaxation
time for momentum transfer considered in Section 2a. When the
scattering potential is represented in terms of the zero energy
scattering length, a, they find

$$\frac{1}{\tau(k)} = \frac{\rho \hbar a^{2} k}{m}\, S(k) \qquad (5)$$


Because the magnitude of the electron wave vector in a bound state
is related to the orbital velocity by $|k| = m|v|/\hbar$, we conclude
that the scattering time is inversely proportional to the speed of the
electron. Thus, the electron makes fewer collisions in the tightly
bound states, and the lifetimes of the states should decrease as the
principal quantum number increases.
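
Equations (4) and (5) can be combined into a single estimate of the line width. The sketch below assumes that the line width is the reduced Planck constant divided by the relaxation time, and that the scattering rate is the product of number density, reduced Planck constant, squared scattering length, wave vector and structure factor, divided by the electron mass. The function names and all numerical inputs are illustrative assumptions, not values from Rice and Jortner.

```python
HBAR = 1.054571817e-34   # J s
M_E = 9.1093837015e-31   # kg, free-electron mass
EV = 1.602176634e-19     # J per eV

def scattering_rate(k, rho, a, s_k, m=M_E):
    """1/tau(k) = rho * hbar * a**2 * k * S(k) / m, in s^-1 (assumed form)."""
    return rho * HBAR * a**2 * k * s_k / m

def line_width_ev(k, rho, a, s_k, m=M_E):
    """Line width Delta = hbar / tau(k), returned in eV (assumed form)."""
    return HBAR * scattering_rate(k, rho, a, s_k, m) / EV

# Hypothetical inputs (SI units), chosen only to exercise the functions:
rho = 1.4e28   # m^-3, number density
a = 3.0e-10    # m, scattering length
k = 1.0e9      # m^-1, electron wave vector
s_k = 1.0      # structure factor, set to unity

print(f"Delta = {line_width_ev(k, rho, a, s_k):.3f} eV")
```

The printed number depends entirely on the assumed inputs. What the sketch does show is the scaling: the width is linear in the wave vector and in the structure factor, and quadratic in the scattering length.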

An immediate test of (4) and (5) is possible using the reflection
spectrum of liquid Xe measured by Beaglehole.24 From the available
electron-atom cross sections we are led to the prediction that
hydrogen-like levels will be broadened, in liquid Xe, by about 0.1
eV. Since the spacing between levels is only about 0.2 eV,25 the set
of exciton levels will appear as an unresolved spectrum. As seen in
Fig. 17 this is in agreement with what is observed: the total width of
the exciton manifold is about the same in liquid and solid, but
resolution of the level structure in the liquid is not possible.

Although the preceding arguments are informative, they do not
come to grips with the fundamental problems which arise in
describing the influence of scattering events on the exciton spectrum.
These difficulties are of two kinds:

(a) It is necessary to describe dynamical processes in a liquid, for
which the simplifications provided by the binary collision
approximation (valid in a dilute gas) or the simple exciton-phonon
linear interaction (valid in a class of crystals) are not available.

(b) There is coupling between the resonant interactions and
nonresonant interactions in the medium.

At present no fully satisfactory solution to these problems is
available, but there do exist approximate calculations that suggest
which physical phenomena are of importance. Popielawski and Rice26
have studied an approximation, based on the Fano theory of line
broadening,1 to describe the shift and broadening of impurity spectra
in a liquid. In their study of the impurity spectrum Popielawski and
Rice assume that:

(a)  The  internal  states  of  the  perturbing  host  molecules  do  not 
influence  the  internal  states  of  the  guest  molecule, 

(b) The translational motion-internal state coupling may be
represented as a sum of pair interactions,

(c) The pair interaction may be meaningfully (even if formally)
separated into a strong short range component and a weak long
range component,

(d) Guest-guest interactions may be neglected,

(e) The initial state of the system may be represented by the
product of the density matrix for the internal state of a guest
molecule and the density matrix for the translational states of all
molecules in the liquid,

(f) The strong short ranged component of the intermolecular
force leads to dynamical events which may be described by a
modified t-matrix (binary collision) expansion,

(g) The weak long ranged component of the intermolecular force
leads to dynamical events which may be described by a weak
coupling expansion.

The analysis is designed to provide an approximate representation
of the line shape function descriptive of transitions localized on an
impurity molecule present in a simple liquid. The general physical
picture which emerges from the analysis is the following: the
internal states of an impurity molecule in a simple liquid are
influenced by a mean field arising from the superposed long range
components of the guest-host interactions of many molecules, and
also by quasi-binary encounters arising from the near approach of
one guest-host pair moving in the fluctuating force field of all the
other molecules. In the quasi-binary encounters the important
component of the interaction is short ranged. It is important to note
that both the mean field effect and the quasi-binary encounters are
defined so as to include the structure of the liquid in the local
equilibrium approximation. The quasi-binary collision term contains
the influence of all sequences of successive dynamically uncorrelated
encounters. A discussion of the implications of the assumptions
cited, and a description of the relationship between the
Popielawski-Rice description and the kinetic theory of liquids27 may
be found in reference 26.

Although  the  preceding  arguments  lead  to  a  simple  description 
of  the  relationship  between  the  line  shape  function  and  the  inter- 
molecular  interactions  for  the  case  of  impurity  spectra,  they  are  not 
readily  extended  to  the  case  where  resonant  interactions  are  im¬ 
portant,  i.e.,  the  pure  liquid.  Even  in  the  absence  of  overlap  of  the 
electronic  wave  functions  of  neighboring  molecules,  the  mixing  of 
effects  arising  from  resonant  and  nonresonant  interactions  leads  to 
complicated  coupling  phenomena.  Considerable  insight  into  the 
general structure of the theory can be obtained from an examination
of  a  simple  model.  Though  the  properties  of  the  model  considered 
depart  in  detail  from  the  properties  of  real  systems,  the  most  im¬ 
portant  consequences  of  the  mixing  of  resonant  and  nonresonant 
interactions can be elucidated and the complications attendant to
a complete (but as yet unavailable) analysis avoided.

Using a kinetic equation approach, Nicolis and Rice28 have
extended the previously described analysis of the spectrum of
polarization waves in a liquid of Drude molecules. As in the
Popielawski-Rice treatment,26 the interactions between the molecules
are separated into components, in this case a short-ranged repulsive
interaction, a resonance interaction represented as a sum of
transition dipole-transition dipole couplings, and a residual soft
(multipolar) interaction. In the case of an impurity spectrum the
derived kinetic equation displays the effects of scattering from the
short ranged repulsions in a modified Enskog kernel and from the
residual soft potential in a Fokker-Planck-like term.27 It is found
that, even when the short ranged repulsions are represented as hard
core interactions, there is both a shift and a broadening of the
spectrum. The average soft field contributes only a shift in the
oscillator frequency, and fluctuations about the mean field lead,
again, to both a shift and broadening of the spectrum. These
deductions are in agreement with those of Popielawski and Rice.
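
To make the terms concrete, a shift and a broadening can be pictured with a Lorentzian line shape, which is the form produced by a simple exponentially damped oscillator. This is a generic illustration and not the line shape derived from the kinetic equation; the parameter names `shift` and `gamma` and the numerical values are assumptions.

```python
import math

def lorentzian(omega, omega0, shift, gamma):
    """Normalized Lorentzian centered at omega0 + shift with FWHM gamma."""
    half = 0.5 * gamma
    return (half / math.pi) / ((omega - omega0 - shift) ** 2 + half ** 2)

# Hypothetical values in arbitrary frequency units:
omega0, shift, gamma = 10.0, -0.3, 0.2
peak = lorentzian(omega0 + shift, omega0, shift, gamma)
half_max = lorentzian(omega0 + shift + gamma / 2, omega0, shift, gamma)
print(peak, half_max / peak)
```

In this picture the shift moves the line center away from the unperturbed oscillator frequency, while the damping rate sets the full width at half maximum.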

Of most interest, however, is the influence of the resonance
coupling in the pure liquid, which was not treated by Popielawski and
Rice. It is found that the dipole-dipole coupling of a pair of
molecules is shielded by the presence in the liquid of other molecules
which  are  also  coupled  to  the  pair  by  dipolar  interactions.  The 
shielding  depends  on  both  the  dipolar  interactions  and  the  short 
range  order  in  the  liquid,  as  well  as  the  internal  state  of  the  mole¬ 
cules.  Further,  the  kinetic  equation  contains  a  term  which  plays  the 
role  of  a  friction  coefficient.  The  evolution  of  the  internal  states  of 
a  molecule  is  now  described  by  an  operator  which  includes:  (a)  all 
dynamical  events  corresponding  to  interactions  in  the  continuum 
states  lying  above  the  bound  states  that  take  part  in  energy  transfer, 
(b)  the  scattering  of  molecules  in  excited  bound  states,  in  general 
with  excitation  transfer,  and  (c)  the  formation  of  new  bound  states. 
The  events  described  by  (a),  wherein  all  bound  states  remain  un¬ 
changed  and  there  is  a  modification  of  the  continuum  states  arising 
from  scattering,  correspond  to  the  limit  in  which  the  internal  de¬ 
grees  of  freedom  of  the  molecule  are  organized  into  Frenkel-type 
excitation  waves,  i.e.,  although  collective  states  of  the  liquid  exist 
they  have  one  to  one  parentage  in  the  states  of  the  free  molecule. 
In contrast, (c) describes the formation of Wannier excitons, and
(b) an intermediate case.29 Although the theory, in principle,
permits passage from the tight binding Frenkel limit to the weak
binding Wannier limit, the neglect of overlap and exchange effects
limits the accuracy of the description of the transition. For this
reason, Nicolis and Rice confine attention to the Frenkel limit. In
this limit, it is shown that the resonance interaction leads again to
both a
shift  and  broadening  of  the  spectrum  of  polarization  waves.  The 
dispersion  relation  still  displays  both  longitudinal  and  transverse 
branches,  and  a  preliminary  estimate  of  the  efficiency  of  resonance 
coupling  versus  the  internal  state-translational  motion  coupling  sug¬ 
gests  that  the  friction  associated  with  the  resonance  coupling  is 
smaller  than  the  friction  associated  with  internal  state-translational 
motion  coupling.  The  theory  described  is  too  general  and  involves 
too  many  approximations  to  make  any  but  simple  deductions  of 
the  type  described. 

The  experimental  data  available  are  inadequate  to  test  even  the 
limited  theory  described.  As  previously  mentioned,  Beaglehole  has 
studied  the  reflection  spectrum  of  liquid  Xe  and  compared  this 
with the spectrum of solid Xe. There is evidence for a collective
excitation in the liquid corresponding to the known Wannier-like
excitations  of  the  solid  (see  Fig.  17)  and  the  inferred  level  broaden¬ 
ing  is  about  that  predicted  by  the  simple  theory.  More  recently, 
Jortner  and  co-workers30  have  studied  the  absorption  spectrum  of 
Xe  in  Ar  and  have  also  shown  that  the  broadening  of  the  impurity 
spectrum  is  in  agreement  with  predictions  of  the  simple  theory. 

Given  that  exciton  states  do  exist  in  a  liquid,  i.e.,  that  molecule 
scattering  events  do  not  so  shorten  the  lifetime  of  collective  excita¬ 
tions  that  these  cease  to  be  meaningful  in  the  description  of  the 
liquid,  it  is  to  be  expected  that  energy  can  be  transferred  over  long 
distances.  There  exist,  at  present,  no  detailed  studies  of  energy 
transfer  in  simple  liquids.  Meyer,  Jortner,  Rice,  and  Wilson31  have 
studied the consequences of exciting liquid He, Ne, Ar, Kr, and
Xe with α particles. The emission spectra of all the liquids are red
shifted (from 2 eV to 6 eV) and, by comparison with known gas
spectra, the emitting species is identified in each case as the excimer
Ne2*, Ar2*, ... (see Fig. 18). The binding energy of the He2* is
2.6  eV,32  and  of  the  other  excimers  presumably  less,  but  enough  to 
promote  a  change  in  local  liquid  structure  resulting  in  trapping  of 
the  excitation  energy.  If  the  lifetime  of  the  molecular  excited  state 
is  long  relative  to  the  time  required  for  molecular  displacement, 
and  if  an  excimer  can  be  formed,  it  seems  likely  that  excitation 
energy  can  be  self-trapped  with  high  efficiency.  Clearly,  the  situation 
in which self-trapping is generated by exciton-fluid coupling is
analogous to the electron self-trapping in liquid He. Just as the
excess electron states of He and Ar are fundamentally different,
corresponding to trapped and free electrons, we must expect to find
liquids in which energy transfer does occur with ease. Each case must
be  examined  separately. 

Of  course,  if  the  excimer  species  is  long  lived,  it  can  serve  as  the 
carrier  of  energy.  Indeed,  a  phenomenon  attributable  to  energy 
transfer  via  intermediacy  of  He2*  was  discovered  by  Meyer,  Jortner, 
Wilson, and Rice.33 When liquid He was doped with N2 and O2
(present as small solid particles) the emission spectrum of the
α-particle irradiated liquid arises from the transitions
$A\,^3\Sigma_u^+ \to X\,^1\Sigma_g^+$ and
$C\,^3\Sigma_u^+ \to X\,^3\Sigma_g^-$ of N2 and O2, respectively
(see Fig. 19). These are the transitions that would be excited by
triplet He2*, and since the lowest triplet state of He has a very
long lifetime (many seconds) it is reasonable to suppose that
diffusion of He2* can serve to transfer electronic excitation energy
over long distances in liquid He. It should be noted that we cannot,
at present, rule out atom interchange-energy exchange,34

$$(\mathrm{He}_a \mathrm{He}_b)^* + \mathrm{He}_c \to (\mathrm{He}_a \mathrm{He}_c)^* + \mathrm{He}_b,$$

or  other  mechanisms  of  energy  transfer.  Much  more  experimental 
work  will  be  required  before  energy  transfer  in  liquids  can  even  be 
outlined  for  detailed  study,  let  alone  interpreted  completely. 
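
The plausibility of long-range transfer by a diffusing excimer can be gauged from the root-mean-square displacement of a Brownian particle, which in three dimensions is the square root of six times the diffusion coefficient times the elapsed time. The diffusion coefficient used below is a hypothetical placeholder, not a measured value for He2* in liquid He, and the time of 1 s simply stands in for the long triplet lifetime mentioned above.

```python
import math

def rms_displacement(diffusion_coeff, time):
    """Root-mean-square displacement sqrt(6 D t) for 3-D diffusion."""
    return math.sqrt(6.0 * diffusion_coeff * time)

# Hypothetical inputs: D in m^2/s (placeholder), t in s.
D = 1.0e-9
t = 1.0
print(f"rms displacement = {rms_displacement(D, t):.2e} m")
```

With these placeholder numbers the displacement is tens of micrometres, far larger than a molecular diameter. Because the displacement grows only as the square root of the time, a hundredfold longer lifetime extends the range tenfold.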

It  is  clear  that  there  remain  many  major  problems  in  the  de¬ 
scription  of  the  exciton  states  of  a  liquid.  These  include: 

(a)  Development  of  a  formalism  that  allows  the  treatment  of 
overlap  and  charge  transfer  phenomena, 

(b)  Development  of  an  understanding  of  the  relationship  be¬ 
tween  scattering  processes  and  excimer  formation, 

(c)  Accumulation  of  a  body  of  experimental  data  with  which  to 
test  the  ideas  thus  far  put  forward  and  to  guide  the  development  of 
an  improved  theory, 

(d)  Development  of  an  understanding  of  the  relationship  be¬ 
tween  the  transition  from  localized  to  delocalized  excitation  states 
and  the  nature  of  energy  transfer  in  the  liquid, 

(e)  Development  of  a  more  realistic  theory  in  which  the  simplifi¬ 
cations  of  the  Drude  model  of  the  molecule  and  other  approxima¬ 
tions  are  removed. 

It  is  my  opinion  that  we  are  at  the  threshold  of  a  vast  expansion 
of  our  understanding  of  the  electronic  properties  of  liquids  and 
other disordered systems. So little is known, either from experiment
or theory that almost any effort is likely to be rewarded with
unexpected results. This field, unlike others, requires, more than
anything else, new concepts and constructs and novel ways of
interpreting strongly coupled phenomena. For some time it is likely that
only  qualitative  interpretations  of  data  will  be  possible,  but  with 
the  steady  accumulation  of  information  and  the  creation  of  new 
interpretations  we  may  look  forward  to  the  development  of  a 
quantitative  theory  encompassing  in  its  scope  the  description  of  the 
properties  of  both  ordered  and  disordered  systems.  Given  that  many 
systems  of  interest,  including  essentially  all  biological  systems,  are 
disordered  to  some  extent,  the  importance  of  a  deeper,  broader, 
and  more  comprehensive  description  of  the  electronic  properties  of 
disordered systems cannot be over-estimated.